FATTY ACID ELONGASES
Field of the Invention
This invention relates to fatty acid elongase
complexes and nucleic acids encoding elongase proteins.
More particularly, the invention relates to nucleic acids
encoding β-ketoacyl synthase proteins that are effective
for producing very long chain fatty acids, polypeptides
produced from such nucleic acids and transgenic plants
expressing such nucleic acids.

Background of the Invention 
Plants are known to synthesize very long chain
fatty acids (VLCFAs). VLCFAs are saturated or
unsaturated monocarboxylic acids with an unbranched
even-numbered carbon chain that is greater than 18 carbons
in length. Many VLCFAs are 20-32 carbons in length, but
VLCFAs can be up to 60 carbons in length. Important
VLCFAs include erucic acid (22:1, i.e., a 22 carbon chain
with one double bond), nervonic acid (24:1), behenic acid
(22:0), and arachidic acid (20:0).

Plant seeds accumulate mostly 16- and 18-carbon
fatty acids. VLCFAs are not desirable in edible oils.
Oilseeds of the Cruciferae (e.g., rapeseed) and a few
other plants, however, accumulate C20 and C22 fatty acids
(FAs). Although plant breeders have developed rapeseed
lines that have low levels of VLCFAs for edible oil
purposes, even lower levels would be desirable. On the
other hand, vegetable oils having elevated levels of
VLCFAs are desirable for certain industrial uses,
including uses as lubricants, fuels and as a feedstock
for plastics, pharmaceuticals and cosmetics.

The biosynthesis of saturated fatty acids up to an
18-carbon chain occurs in the chloroplast. C2 units from
acyl thioesters are linked sequentially, beginning with
the condensation of acetyl Coenzyme A (CoA) and malonyl
acyl carrier protein (ACP) to form a C4 acyl fatty acid.
This condensation reaction is catalyzed by a β-ketoacyl
synthase III (KASIII). β-ketoacyl moieties are also
referred to as 3-ketoacyl moieties.

The enzyme β-ketoacyl synthase I (KASI) is
involved in the addition of C2 groups to form the C6 to
C16 saturated fatty acids. KASI catalyzes the stepwise
condensation of a fatty acyl moiety (C4 to C14) with
malonyl-ACP to produce a 3-ketoacyl-ACP product that is 2
carbons longer than the substrate. The last condensation
reaction in the chloroplast, converting C16 to C18, is
catalyzed by β-ketoacyl synthase II (KASII).

Each elongation cycle involves three enzymatic
steps in addition to the condensation reaction
discussed above. Briefly, the β-ketoacyl condensation
product is reduced to β-hydroxyacyl-ACP, dehydrated to
the enoyl-ACP, and finally reduced to a fully reduced
acyl-ACP. The fully reduced fatty acyl-ACP reaction
product then serves as the substrate for the next cycle
of elongation.
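
The division of labor among the plastidial condensing enzymes described above can be summarized as simple bookkeeping. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it tracks nothing but chain length (no kinetics, cofactors or acyl carriers are modeled).

```python
# Illustrative bookkeeping only; all names here are hypothetical.

# The three steps that follow each condensation, in the order given in the text.
POST_CONDENSATION_STEPS = (
    "reduction to beta-hydroxyacyl-ACP",
    "dehydration to enoyl-ACP",
    "reduction to saturated acyl-ACP",
)


def condensing_enzyme(substrate_carbons: int) -> str:
    """Return the KAS isoform the text assigns to an acyl substrate length."""
    if substrate_carbons == 2:
        return "KASIII"  # acetyl-CoA + malonyl-ACP -> C4
    if 4 <= substrate_carbons <= 14 and substrate_carbons % 2 == 0:
        return "KASI"    # C4..C14 substrates -> C6..C16 products
    if substrate_carbons == 16:
        return "KASII"   # C16 -> C18, the last condensation in the chloroplast
    raise ValueError(f"no plastidial KAS described for C{substrate_carbons}")


def plastidial_cycles(final_carbons: int = 18):
    """List (substrate, enzyme, product) chain lengths for each cycle."""
    cycles = []
    carbons = 2
    while carbons < final_carbons:
        cycles.append((carbons, condensing_enzyme(carbons), carbons + 2))
        carbons += 2  # each cycle adds one C2 unit
    return cycles


if __name__ == "__main__":
    for substrate, enzyme, product in plastidial_cycles():
        print(f"C{substrate} --{enzyme}--> C{product}")
```

Building an 18-carbon chain from a C2 starter in this way takes eight condensation cycles, each followed by the three steps listed in `POST_CONDENSATION_STEPS`.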

The C18 saturated fatty acid (stearic acid, 18:0)
can be transported out of the chloroplast and converted
to the monounsaturate C18:1 (oleic acid), and the
polyunsaturates C18:2 (linoleic acid) and C18:3
(α-linolenic acid). C18:0 and C18:1 can also be elongated
outside the chloroplast to form VLCFAs. The formation of
VLCFAs involves the sequential condensation of two carbon
groups from malonyl CoA with a C18:0 or C18:1 fatty acid
substrate. Elongation of fatty acids longer than 18
carbons depends on the activity of a fatty acid elongase
complex to carry out four separate enzyme reactions
similar to those described above for fatty acid synthesis
in the chloroplast. Fehling, Biochem. Biophys. Acta
1082:239-246 (1991). In plants, elongase complexes are
distinct from fatty acid synthases since elongases are
extraplastidial and membrane bound.

Mutations have been identified in an Arabidopsis
gene associated with fatty acid elongation. This gene,
designated the FAE1 gene, is involved in the condensation
step of an elongation cycle. See, WO 96/13582,
incorporated herein by reference. Plants carrying a
mutation in FAE1 have significant decreases in the levels
of VLCFAs in seeds. Genes associated with wax
biosynthesis in jojoba have also been cloned and
sequenced (WO 95/15387, incorporated herein by
reference).

Very long chain fatty acids are key components of
many biologically important compounds in animals, plants,
and microorganisms. For example, in animals, the VLCFA
arachidonic acid is a precursor to many prostaglandins.
In plants, VLCFAs are major constituents of
triacylglycerols in many seed oils, are essential
precursors for cuticular wax production, and are utilized
in the synthesis of glycosylceramides, an important
component of the plasma membrane.

Obtaining detailed information on the biochemistry
of KAS enzymes has been hampered by the difficulties
encountered when purifying membrane bound enzymes.
Although elongase activities have been partially purified
from a number of sources, or studied using cell
fractions, the elucidation of the biochemistry of
elongase complexes has been hampered by the complexity of
the membrane fractions used as the enzyme source. For
example, until recently, it was unclear whether
plant elongase complexes were composed of a
multifunctional polypeptide similar to the FAS found in
animals and yeast, or whether the complexes existed as
discrete and dissociable enzymes similar to the FAS of
plants and bacteria. Partial purification of an elongase
KAS, immunoblot identification of the hydroxy acyl
dehydrase, and the recent cloning of a KAS gene (FAE1)
suggest that the enzyme activities of elongase complexes
exist on individual enzymes.



Summary of the Invention 
The invention disclosed herein relates to an
isolated polynucleotide selected from one of the
following: SEQ ID NO:1; SEQ ID NO:3; SEQ ID NO:5; SEQ ID
NO:7; SEQ ID NO:9; SEQ ID NO:11; SEQ ID NO:13; an RNA
analog of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, or 15; and a
polynucleotide having a nucleic acid sequence
complementary to one of the above. The polynucleotide
can also be a nucleic acid fragment of one of the above
sequences that is at least 15 nucleotides in length and
that hybridizes under stringent conditions to genomic DNA
encoding the polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ
ID NO:14.

Also disclosed herein is an isolated polypeptide
that has an amino acid sequence substantially identical
to one of the following: SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID
NO:14. Also disclosed are isolated polynucleotides
encoding polypeptides substantially identical in their
amino acid sequence to: SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID
NO:14.

The invention also relates to a transgenic plant
containing a nucleic acid construct. The nucleic acid
construct comprises a polynucleotide described above.
The construct further comprises a regulatory element
operably linked to the polynucleotide. The regulatory
element may be a tissue-specific promoter, for example, an
epidermal cell-specific promoter or a seed-specific
promoter. The regulatory element may be operably linked
to the polynucleotide in sense or antisense orientation.
The plant has altered levels of very long chain fatty
acids in tissues where the polynucleotide is expressed,
compared to a parental plant lacking the nucleic acid
construct.

A method is disclosed for altering the levels of
very long chain fatty acids in a plant. The method
comprises the steps of creating a nucleic acid construct
and introducing the construct into the plant. The
construct includes a polynucleotide selected from one of
the following: SEQ ID NO:1; SEQ ID NO:3; SEQ ID NO:5; SEQ
ID NO:7; SEQ ID NO:9; SEQ ID NO:11; SEQ ID NO:13; an RNA
analog of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, or 15; and a
polynucleotide having a nucleic acid sequence
complementary to one of the above. The polynucleotide
can also be a nucleic acid fragment of one of the above
that is at least 15 nucleotides in length and that
hybridizes under stringent conditions to genomic DNA
encoding the polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ
ID NO:14. The polynucleotide is effective for altering
the levels of very long chain fatty acids in the plant.

Other features and advantages of the invention 
will be apparent from the following description of the 
preferred embodiments thereof, and from the claims. 

Brief Description of the Drawings
Figure 1 shows the time course of in vitro VLCFA
synthesis by FAE1 expressed in yeast, with 3 different
acyl-CoA substrates.

Figure 2 shows the rates of in vitro VLCFA
synthesis and the VLCFA profiles of FAE1 expressed in
yeast, with 3 different acyl-CoA substrates.

Figure 3 shows the nucleotide sequence of the
coding region of the Arabidopsis EL1 polynucleotide (SEQ
ID NO:1).

Figure 4 shows the deduced amino acid sequence
(SEQ ID NO:2) for the EL1 coding sequence of Figure 3.

Figure 5 shows the nucleotide sequence of the
coding region of the Arabidopsis EL2 polynucleotide (SEQ
ID NO:3).

Figure 6 shows the deduced amino acid sequence
(SEQ ID NO:4) for the EL2 coding sequence of Figure 5.

Figure 7 shows the nucleotide sequence of the
coding region of the Arabidopsis EL3 polynucleotide (SEQ
ID NO:5).

Figure 8 shows the deduced amino acid sequence
(SEQ ID NO:6) for the EL3 coding sequence of Figure 7.

Figure 9 shows the nucleotide sequence of the
coding region of the Arabidopsis EL4 polynucleotide (SEQ
ID NO:7).

Figure 10 shows the deduced amino acid sequence
(SEQ ID NO:8) for the EL4 coding sequence of Figure 9.

Figure 11 shows the nucleotide sequence of the
coding region of the Arabidopsis EL5 polynucleotide (SEQ
ID NO:9).

Figure 12 shows the deduced amino acid sequence
(SEQ ID NO:10) for the EL5 coding sequence of Figure 11.

Figure 13 shows the nucleotide sequence of the
coding region of the Arabidopsis EL6 polynucleotide (SEQ
ID NO:11).

Figure 14 shows the deduced amino acid sequence
(SEQ ID NO:12) for the EL6 coding sequence of Figure 13.

Figure 15 shows the nucleotide sequence of the
coding region of the Arabidopsis EL7 polynucleotide (SEQ
ID NO:13).

Figure 16 shows the deduced amino acid sequence
(SEQ ID NO:14) for the EL7 coding sequence of Figure 15.



Description of the Preferred Embodiments 
The present invention comprises isolated nucleic
acids (polynucleotides) that encode polypeptides having
β-ketoacyl synthase activity. The novel polynucleotides
and polypeptides of the invention are involved in the
synthesis of very long chain fatty acids and are useful
for modulating the total amounts of such fatty acids and
the specific VLCFA profile in plants.

A polynucleotide of the invention may be in the
form of RNA or in the form of DNA, including cDNA,
synthetic DNA or genomic DNA. The DNA may be
double-stranded or single-stranded, and if single-stranded,
can be either the coding strand or non-coding strand. An RNA
analog may be, for example, mRNA or a combination of
ribo- and deoxyribonucleotides. Illustrative examples of
a polynucleotide of the invention are shown in Figs. 3,
5, 7, 9, 11, 13 and 15.

A polynucleotide of the invention typically is at
least 15 nucleotides (or base pairs, bp) in length. In
some embodiments, a polynucleotide is about 20 to 100
nucleotides in length, or about 100 to 500 nucleotides in
length. In other embodiments, a polynucleotide is
greater than about 1500 nucleotides in length and encodes
a polypeptide having the amino acid sequence shown in
Figs. 4, 6, 8, 10, 12, 14 or 16.

In some embodiments, a polynucleotide of the
invention encodes analogs or derivatives of a polypeptide
having the deduced amino acid sequence of Figs. 4, 6, 8,
10, 12, 14 or 16. Such fragments, analogs or derivatives
include, for example, naturally occurring allelic
variants, non-naturally occurring allelic variants,
deletion variants and insertion variants, that do not
substantially alter the function of the polypeptide.

A polynucleotide of the invention may further
comprise additional nucleic acids. For example, a
nucleic acid fragment encoding a secretory or leading
amino acid sequence can be fused in-frame to the amino
terminal end of one of the EL1 through EL7 polypeptides.
Other nucleic acid fragments are known in the art that
encode amino acid sequences useful for fusing in-frame to
the KAS polypeptides disclosed herein. See, e.g., U.S.
5,629,193, incorporated herein by reference. A
polynucleotide may further comprise one or more
regulatory elements operably linked to a KAS
polynucleotide disclosed herein.

The present invention also comprises
polynucleotides that hybridize to a KAS polynucleotide
disclosed herein. Such a polynucleotide typically is at
least 15 nucleotides in length. Hybridization typically
involves Southern analysis (Southern blotting), a method
by which the presence of DNA sequences in a target
nucleic acid mixture is identified by hybridization to a
labeled oligonucleotide or DNA fragment probe. Southern
analysis typically involves electrophoretic separation of
DNA digests on agarose gels, denaturation of the DNA
after electrophoretic separation, and transfer of the DNA
to nitrocellulose, nylon, or another suitable membrane
support for analysis with a radiolabeled, biotinylated,
or enzyme-labeled probe as described in sections
9.37-9.52 of Sambrook et al., (1989) Molecular Cloning,
second edition, Cold Spring Harbor Laboratory, Plainview, NY.

A polynucleotide can hybridize under moderate
stringency conditions or, preferably, under high
stringency conditions to a KAS polynucleotide disclosed
herein. High stringency conditions are used to identify
nucleic acids that have a high degree of homology to the
probe. High stringency conditions can include the use of
low ionic strength and high temperature for washing, for
example, 0.015 M NaCl/0.0015 M sodium citrate (0.1X SSC);
0.1% sodium lauryl sulfate (SDS) at 65°C. Alternatively,
a denaturing agent such as formamide can be employed
during hybridization, e.g., 50% formamide with 0.1%
bovine serum albumin/0.1% Ficoll/0.1%
polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH
6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C.
Another example is the use of 50% formamide, 5 x SSC
(0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium
phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 x
Denhardt's solution, sonicated salmon sperm DNA (50
µg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with
washes at 42°C in 0.2 x SSC and 0.1% SDS.
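
The SSC concentrations quoted above all follow from the composition of 1X SSC (0.15 M NaCl, 0.015 M sodium citrate). The short Python sketch below makes the conversion explicit. The function names are hypothetical, and the 20X stock used in the dilution helper is a common laboratory convention assumed here for illustration; it is not specified in the text.

```python
from typing import Tuple

# Composition of 1X SSC, consistent with the 0.1X and 5X values in the text.
NACL_1X_M = 0.15
CITRATE_1X_M = 0.015


def ssc_molarity(fold: float) -> Tuple[float, float]:
    """Return (NaCl, sodium citrate) molarity for a given SSC strength."""
    return (round(fold * NACL_1X_M, 6), round(fold * CITRATE_1X_M, 6))


def stock_volume_ml(fold: float, final_ml: float, stock_fold: float = 20.0) -> float:
    """Volume of concentrated stock needed for final_ml of wash buffer.

    The 20X default is an assumption (a commonly used stock strength).
    """
    if fold > stock_fold:
        raise ValueError("target strength exceeds the stock strength")
    return fold * final_ml / stock_fold


if __name__ == "__main__":
    for fold in (0.1, 0.2, 5.0):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold}X SSC: {nacl} M NaCl, {citrate} M sodium citrate")
```

For example, `ssc_molarity(0.1)` reproduces the 0.015 M NaCl/0.0015 M sodium citrate high stringency wash, and `ssc_molarity(5)` reproduces the 0.75 M NaCl/0.075 M sodium citrate hybridization buffer.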

Moderate stringency conditions refer to
hybridization conditions used to identify nucleic acids
that have a lower degree of identity to the probe than do
nucleic acids identified under high stringency
conditions. Moderate stringency conditions can include
the use of higher ionic strength and/or lower
temperatures for washing of the hybridization membrane,
compared to the ionic strength and temperatures used for
high stringency hybridization. For example, a wash
solution comprising 0.060 M NaCl/0.0060 M sodium citrate
(4X SSC) and 0.1% sodium lauryl sulfate (SDS) can be used
at 50°C, with a last wash in 1X SSC, at 65°C.
Alternatively, a hybridization wash in 1X SSC at 37°C can
be used.

Hybridization can also be done by Northern
analysis (Northern blotting), a method used to identify
RNAs that hybridize to a known probe such as an
oligonucleotide, DNA fragment, cDNA or fragment thereof,
or RNA fragment. The probe is labeled with a
radioisotope such as ³²P, by biotinylation or with an
enzyme. The RNA to be analyzed is usually
electrophoretically separated on an agarose or
polyacrylamide gel, transferred to nitrocellulose, nylon,
or other suitable membrane, and hybridized with the
probe, using standard techniques well known in the art
such as those described in sections 7.39-7.52 of Sambrook
et al., supra.

A polynucleotide has at least about 70% sequence
identity, preferably at least about 80% sequence
identity, more preferably at least about 90% sequence
identity to SEQ ID NO:1, 3, 5, 7, 9, 11, or 13. Sequence
identity can be determined, for example, by computer
programs designed to perform single and multiple sequence
alignments.
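
As a minimal illustration of how a percent identity figure can be derived once two sequences have been aligned, the Python sketch below counts matching columns in a pairwise alignment. It is not an alignment program, and the scoring convention it uses (matches divided by aligned columns, with a gap opposite a residue counted as a mismatch) is only one of several in use and is an assumption here. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: columns where both sequences have a gap are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical 10-column alignment with two mismatches.
    identity = percent_identity("ATGACGTCCG", "ATGTCGTCAG")
    print(f"{identity:.1f}% identity; meets 70% threshold: {identity >= 70.0}")
```

The result can then be compared against the 70%, 80% or 90% thresholds recited above.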

A polynucleotide of the invention can be obtained
by chemical synthesis, isolation and cloning from plant
genomic DNA or other means known to the art, including
the use of PCR technology carried out using
oligonucleotides corresponding to portions of SEQ ID
NO:1, 3, 5, 7, 9, 11 or 13. Polymerase chain reaction
(PCR) refers to a procedure or technique in which target
nucleic acid is amplified in a manner similar to that
described in U.S. Patent No. 4,683,195, incorporated
herein by reference, and subsequent modifications of the
procedure described therein. Generally, sequence
information from the ends of the region of interest or
beyond is employed to design oligonucleotide primers that
are identical or similar in sequence to opposite strands
of the template to be amplified. PCR can be used to
amplify specific RNA sequences, specific DNA sequences
from total genomic DNA, and cDNA transcribed from total
cellular RNA, bacteriophage or plasmid sequences, and the
like. Alternatively, a cDNA library (in an expression
vector) can be screened with KAS-specific antibody
prepared using peptide sequence(s) from hydrophilic
regions of the KAS protein of SEQ ID NO:2 and technology
known in the art.

A polypeptide of the invention comprises an
isolated polypeptide having the deduced amino acid
sequence of Figs. 2, 4, 6, 8, 10 and 12, as well as
derivatives and analogs thereof. By "isolated" is meant
a polypeptide that is expressed and produced in an
environment other than the environment in which the
polypeptide is naturally expressed and produced. For
example, a plant polypeptide is isolated when expressed
and produced in bacteria or fungi. Similarly, a plant
polypeptide is isolated when its gene coding sequence is
operably linked to a chimeric regulatory element and
expressed in a tissue where the polypeptide is not
naturally expressed. A polypeptide of the invention also
comprises variants of the KAS polypeptides disclosed
herein, as discussed above.

A full-length KAS coding sequence may comprise the
sequence shown in SEQ ID NO:1, 3, 5, 7, 9, 11 or 13.
Alternatively, a chimeric full-length KAS coding sequence
may be formed by linking, in-frame, nucleotides from the
5' region of a first KAS gene to nucleotides from the 3'
region of a second KAS gene, thereby forming a chimeric
KAS protein.

It should be appreciated that nucleic acid
fragments having a nucleotide sequence other than the KAS
sequences disclosed in SEQ ID NO:1, 3, 5, 7, 9, 11 or 13
will encode a polypeptide having the exemplified amino
acid coding sequence of SEQ ID NO:2, 4, 6, 8, 10, 12 or
14, respectively. The degeneracy of the genetic code is
well-known to the art; i.e., for many amino acids, there
is more than one nucleotide triplet which serves as the
codon for the amino acid.
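
The degeneracy described above can be made concrete with the standard genetic code. The Python sketch below builds the standard codon table, groups codons by the amino acid they encode, and shows that two different nucleotide sequences can encode the same peptide. The helper names and the two short example sequences are hypothetical and are not taken from the sequence listing.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons():
    """Map each amino acid (or '*' for stop) to the codons that encode it."""
    groups = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        groups[aa].append(codon)
    return dict(groups)


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    groups = synonymous_codons()
    for aa in sorted(groups):
        print(aa, len(groups[aa]), " ".join(groups[aa]))
    # Two different (hypothetical) nucleotide sequences, one peptide.
    print(translate("GGTCGGTCCTAA"), translate("GGACGCAGTTGA"))
```

Leucine, serine and arginine each have six codons, whereas methionine and tryptophan each have one, which is why only "many" rather than all amino acids are degenerate.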

It should also be appreciated that certain amino
acid substitutions can be made in protein sequences
without affecting the function of the protein.
Generally, conservative amino acid substitutions or
substitutions of similar amino acids are tolerated
without affecting protein function. Similar amino acids
can be those that are similar in size and/or charge
properties; for example, aspartate and glutamate, and
isoleucine and valine, are both pairs of similar amino
acids. Similarity between amino acid pairs has been
assessed in the art in a number of ways. For example,
Dayhoff et al. (1978) in Atlas of Protein Sequence and
Structure, Vol. 5, Suppl. 3, pp. 345-352, which is
incorporated by reference herein, provides frequency
tables for amino acid substitutions which can be employed
as a measure of amino acid similarity.

A nucleic acid construct of the invention
comprises a polynucleotide as disclosed herein linked to
another, different polynucleotide. For example, a
full-length KAS coding sequence may be operably fused
in-frame to a nucleic acid fragment that encodes a leader
sequence, secretory sequence or other additional amino
acid sequences that may be usefully linked to a
polypeptide or peptide fragment.

A transgenic plant of the invention contains a
nucleic acid construct as described herein. In some
embodiments, a transgenic plant contains a nucleic acid
construct that comprises a polynucleotide of the
invention operably linked to at least one suitable
regulatory sequence in sense orientation. Regulatory
sequences typically do not themselves code for a gene
product. Instead, regulatory sequences affect the
expression level of the polynucleotide to which they are
linked. Examples of regulatory sequences are known in
the art and include, without limitation, minimal
promoters and promoters of genes preferentially or
exclusively expressed in seeds or in epidermal cells of
stems and leaves. Native regulatory sequences of the
polynucleotides disclosed herein can be readily isolated
by those skilled in the art and used in constructs of the
invention. Other examples of suitable regulatory
sequences include enhancers or enhancer-like elements,
introns, 3' non-coding regions such as poly A sequences
and other regulatory sequences discussed herein.
Molecular biology techniques for preparing such chimeric
genes are known in the art.

In other embodiments, a transgenic plant contains
a nucleic acid construct comprising a partial or a
full-length KAS coding sequence operably linked to at least
one suitable regulatory sequence in antisense
orientation. The chimeric gene can be introduced into a
plant, and transgenic progeny displaying expression of the
antisense construct are identified.

One may use a polynucleotide disclosed herein for
cosuppression as well as for antisense inhibition.
Cosuppression of genes in plants may be achieved by
expressing, in the sense orientation, the entire or
partial coding sequence of a gene. See, e.g., WO
94/11516, incorporated herein by reference.

Transgenic techniques for use in the invention
include, without limitation, Agrobacterium-mediated
transformation, viral vector-mediated transformation,
electroporation and particle gun transformation.
Illustrative examples of transformation techniques are
described in U.S. Patent 5,204,253 (particle gun) and
U.S. Patent 5,188,958 (Agrobacterium), incorporated
herein by reference. Transformation methods utilizing
the Ti and Ri plasmids of Agrobacterium spp. typically
use binary-type vectors. Walkerpeach, C. et al., in
Plant Molecular Biology Manual, S. Gelvin and R.
Schilperoort, eds., Kluwer Dordrecht, C1:1-19 (1994). If
cell or tissue cultures are used as the recipient tissue
for transformation, plants can be regenerated from
transformed cultures by techniques known to those skilled
in the art.

Techniques are known for the introduction of DNA
into monocots as well as dicots, as are the techniques
for culturing such plant tissues and regenerating those
tissues. Monocots which have been successfully
transformed and regenerated include wheat, corn, rye,
rice, and asparagus. See, e.g., U.S. Patent Nos.
5,484,956 and 5,550,318, incorporated herein by
reference.

For efficient production of transgenic plants from
plant cells, it is desirable that the plant tissue used
for transformation possess a high capacity for
regeneration. Transgenic plants of woody species such as
poplar and aspen have also been obtained. Technology is
also available for the manipulation, transformation, and
regeneration of gymnosperm plants. For example, U.S.
Patent No. 5,122,466 describes the biolistic
transformation of conifers, with preferred target tissue
being meristematic and cotyledon and hypocotyl tissues.
U.S. Patent No. 5,041,382 describes enrichment of conifer
embryonal cells.

Seeds produced by a transgenic plant(s) can be
grown and then selfed (or outcrossed and selfed) to
obtain seeds homozygous for the construct. Seeds can be
analyzed in order to identify those homozygotes having
the desired expression of the construct. Transgenic
plants may be entered into a breeding program, e.g., to
introgress the novel construct into other lines, to
transfer the construct to other species, or for further






WO 98/54954 



PCT/US98/11384 




selection of other desirable traits. Alternatively,
transgenic plants may be propagated vegetatively for
those species amenable to such techniques. A nucleic
acid construct of the invention can alter the levels of
very long chain fatty acids in plant tissues expressing
the polynucleotide, compared to VLCFA levels in
corresponding tissues from an otherwise identical plant
not expressing the polynucleotide. A comparison can be
made, for example, between a non-transgenic plant of a
plant line and a transgenic plant of the same plant line.
Levels of VLCFAs having 20-32 carbons and/or levels of
VLCFAs having 32-60 carbons can be altered in plants
disclosed herein. Plants having an altered VLCFA
composition may be identified by techniques known to the
skilled artisan, e.g., thin layer chromatography or
gas-liquid chromatography (GLC) analysis of the appropriate
plant tissue.

A suitable group of plants with which to practice
the invention are the Brassica species, including B.
napus, B. rapa, B. juncea, and B. hirta. Other suitable
plants include, without limitation, soybean (Glycine
max), sunflower (Helianthus annuus) and corn (Zea mays).

A method according to the invention comprises
introducing a nucleic acid construct into a plant cell
and producing a plant (as well as progeny of such a
plant) from the transformed cell. Progeny includes
descendants of a particular plant or plant line, e.g.,
seeds developed on an instant plant are descendants.
Progeny of an instant plant include seeds formed on F1,
F2, F3, and subsequent generation plants, or seeds formed
on BC1, BC2, BC3, and subsequent generation plants.

Methods and compositions according to the
invention are useful in that the resulting plants and
plant lines have desirable alterations in very long chain
fatty acid composition. Suitable tissues in which to
express polynucleotides and/or polypeptides of the
invention include, without limitation, seeds, stems and
leaves. Leaf tissues of interest include cells and
tissues of the epidermis, e.g., cells that are involved
in forming trichomes. Of particular interest are
epidermal cells involved in forming the cuticular layer.
The cuticular layer comprises various very long chain
fatty acids and VLCFA derivatives such as alkanes,
esters, alcohols and aldehydes. Altering the composition
and amount of VLCFAs in epidermal cells and tissues may
enhance defense mechanisms and drought tolerance of
plants disclosed herein.

Polynucleotides of the invention can be used as
markers in plant genetic mapping and plant breeding
programs. Such markers may include RFLP, RAPD, or PCR
markers, for example. Marker-assisted breeding
techniques may be used to identify and follow a desired
fatty acid composition during the breeding process.
Marker-assisted breeding techniques may be used in
addition to, or as an alternative to, other sorts of
identification techniques. An example of marker-assisted
breeding is the use of PCR primers that specifically
amplify a sequence from a desired KAS that has been
introduced into a plant line and is being crossed into
other plant lines.

Plants and plant lines disclosed herein preferably
have superior agronomic properties. Superior agronomic
characteristics include, for example, increased seed
germination percentage, increased seedling vigor,
increased resistance to seedling fungal diseases (damping
off, root rot and the like), increased yield, and
improved standability.

While the invention is susceptible to various
modifications and alternative forms, certain specific
embodiments thereof are described in the general methods
and examples set forth below. It should be understood,
however, that these examples are not intended to limit
the invention to the particular forms disclosed but,
instead, the invention is to cover all modifications,
equivalents and alternatives falling within the scope of
the invention.



EXAMPLES 
Example 1 

Cloning and Expression of FAE1 in Yeast Cells 

The open reading frame of the Arabidopsis FAE1
gene was amplified directly by PCR, using Arabidopsis
thaliana cv. Columbia genomic DNA as a template, Pfu DNA
polymerase and the following primers:
5'-CTCGAGGAGCAATGACGTCCGTTAA-3' and
5'-CTCGAGTTAGGACCGACCGTTTTG-3'. The PCR product was
blunt-end cloned into the EcoRV site of pBluescript
(Stratagene, La Jolla, CA).
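
The two primers listed above can be inspected programmatically. The Python sketch below reverse-complements the second primer to read it on the coding strand and reports a few features of each oligonucleotide. The helper names are hypothetical, and the observations in the comments (a shared 5' CTCGAG hexamer, which is the XhoI recognition sequence, an ATG in the first primer, and a TAA stop codon on the coding strand of the second) are read directly from the sequences rather than stated in the text.

```python
# Primer sequences as given in Example 1 (written 5' to 3').
FORWARD = "CTCGAGGAGCAATGACGTCCGTTAA"
REVERSE = "CTCGAGTTAGGACCGACCGTTTTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    coding_end = reverse_complement(REVERSE)
    print("forward primer:", FORWARD, f"({len(FORWARD)} nt, {gc_percent(FORWARD):.0f}% GC)")
    print("reverse primer on coding strand:", coding_end)
    # Both primers begin with CTCGAG (the XhoI recognition sequence).
    print("shared 5' hexamer:", FORWARD[:6] == REVERSE[:6] == "CTCGAG")
    # Position of the first ATG in the forward primer (0-based).
    print("ATG offset in forward primer:", FORWARD.find("ATG"))
    # The coding-strand view of the reverse primer ends with a stop codon
    # followed by the added hexamer.
    print("ends with TAA + CTCGAG:", coding_end.endswith("TAACTCGAG"))
```

Reading the reverse primer on the coding strand is a quick check that the amplified open reading frame carries its own stop codon ahead of the added 5' extension.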

The FAE1 gene was excised from the Bluescript
vector with BamHI, and then subcloned into pYEUra3
(Clontech, Palo Alto, CA). pYEUra3 is a yeast
centromere-containing, episomal plasmid that is
propagated stably through cell division. The FAE1 gene
was inserted downstream of a GAL1 promoter in pYEUra3.
The GAL1 promoter is induced when galactose is present in
the medium and repressed when glucose is present in the
growth medium.

Insertion of the FAE1 gene in the sense
orientation was confirmed by PCR, and pYEUra3/FAE1 was
used to transform Saccharomyces cerevisiae strain [...]
using a lithium acetate procedure as described [...]
R. and Woods, R., in Molecular Genetics of Yeast:
Practical Approaches, Oxford Press, pp. 121-17[...].
Plasmid DNA was isolated from putative transformants, and
the presence of the FAE1/pYEUra3 construct was confirmed
by Southern analysis.

Yeast transformed with pYEUra3 having FAE1
operably linked to the GAL1 promoter were grown in the
presence of galactose or glucose and were analyzed for
the expression of FAE1. As a control, yeast transformed
with pYEUra3 containing no insert were also assayed.
Analysis of such control preparations yielded fatty acid
compositions and fatty acid elongation rates similar to
those of yeast transformed with pYEUra3/FAE1 and grown
with glucose as the carbon source.

The fatty acid composition of yeast cells grown in
the presence of galactose was compared to that of cells
grown in the presence of glucose, to determine if VLCFAs
were found in the galactose-induced cells.

Transformed yeast cells were grown overnight in
YPD media at 30°C with vigorous shaking. One hundred µl
of the overnight culture were used to inoculate 40 ml of
complete minimal uracil dropout media (CM-Ura)
supplemented with either glucose or galactose (2% w/v).
Cultures were grown at 30°C to an OD600 of approximately
1.3 to 1.5. Cells were harvested by centrifugation at
5000 × g for 10 min. Total lipids were extracted from the
cells with 2 volumes of 4N KOH in 100% methanol for 60
min at 80°C. Fatty acids were saponified and methyl
esters were prepared by drying the samples and
resuspending in 0.5 ml of boron trichloride in methanol
(10% v/v). Samples were incubated at 50°C for 15 min in
a sealed tube. About 2 ml of water was then added and
the fatty acid methyl esters were extracted thrice with
1 ml of hexane. Extracts were dried under nitrogen and
redissolved in hexane. See Hlousek-Radojcic, A. et al.,
Plant J. 8:803-809. Methyl esters were analyzed on an HP
5890 series II gas chromatograph equipped with a 5771 MSD
and 7673 autoinjector (Hewlett-Packard, Cincinnati, OH).
Methyl esters were separated on a DB-23 (J&W Scientific)
capillary column (30 m × 0.25 mm × 0.25 µm). The column
was operated with helium carrier gas and splitless
injection (injection temperature 280°C, detector
temperature 280°C). After an initial 3 min at 100°C,
the oven temperature was raised to 250°C at 20°C min⁻¹ and
maintained at that temperature for an additional 3 min.
The identity of the peaks was verified by
cochromatography with authentic standards and by mass
spectrometer analysis.
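The oven program described above implies a fixed run length per injection. As a minimal sketch (the function and parameter names are hypothetical, and cool-down and equilibration time between runs are ignored), the total can be worked out as follows:

```python
def oven_program_minutes(initial_hold_min, start_temp_c, final_temp_c,
                         ramp_c_per_min, final_hold_min):
    """Total time for a single-ramp GC oven program, in minutes."""
    ramp_min = (final_temp_c - start_temp_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values from the text: 3 min at 100 C, ramp to 250 C at 20 C/min,
# then hold for a further 3 min.
total_minutes = oven_program_minutes(3, 100, 250, 20, 3)

if __name__ == "__main__":
    print(f"Run time per injection: {total_minutes} min")
```

With these values the ramp takes 7.5 min, so each run lasts 13.5 min.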

The results clearly revealed the appearance of
both 20:1 and 22:1 acyl-CoA products in galactose-induced
yeast containing the FAE1 coding sequence. Uninduced
yeast cells failed to accumulate significant amounts of
fatty acids longer than C18. These results indicate that
expression of FAE1 in yeast resulted in functional KAS
activity and functional elongase activity.

Example 2 
FAE1 Activity in Yeast Microsomes 

The functional expression of the FAE1 KAS was
analyzed by isolating microsomes from transformed yeast
cells and assaying these microsomes in vitro for elongase
activity.

Transformed yeast cells were grown in the presence
of either glucose or galactose (2% w/v) as described in
Example 1. Cells were harvested by centrifugation at
5000 × g for 10 min and washed with 10 ml ice cold
isolation buffer (IB), which contains 80 mM Hepes-KOH, pH
7.2, 5 mM EGTA, 5 mM EDTA, 10 mM KCl, 320 mM sucrose and
2 mM DTT. Cells were then resuspended in enough IB to
fill a 1.7 ml tube containing 700 µl of 0.5 µm glass
beads and yeast microsomes were isolated from the cells
essentially as described in Tillman, T. and Bell, R., J.
Biol. Chem. 261:9144-9149 (1986). The microsomal
membrane pellet was recovered by centrifugation at
252,000 × g for 60 min. The pellet was rinsed by
resuspending in 40 ml fresh IB and again recovered by
centrifugation at 252,000 × g for 60 min. Microsomal
pellets were resuspended in a minimal volume of IB, and
the protein concentration adjusted to 2.5 µg µl⁻¹ by
addition of IB containing 15% glycerol. Microsomes were
frozen on dry ice and stored at -80°C. The protein
concentration in microsomes was determined by the
Bradford method (Bradford, 1976).

Fatty acid elongase activity was measured
essentially as described in Hlousek-Radojcic, A. et al.,
Plant J. 8:803-809 (1995). Briefly, the standard
elongation reaction mix contained 80 mM Hepes-KOH, pH
7.2, 20 mM MgCl2, 500 µM NADPH, 1 mM ATP, 100 µM
malonyl-CoA, 10 µM CoA-SH and 15 µM radioactive acyl-CoA
substrate. The radiolabeled substrate was either
[1-¹⁴C]18:1-CoA (50 µCi µmol⁻¹), [1-¹⁴C]18:0-CoA
(55 µCi µmol⁻¹), or [1-¹⁴C]16:0-CoA (54 µCi µmol⁻¹). The
reaction was initiated by the addition of yeast
microsomes (5 µg protein) and the mixture incubated at
30°C for the indicated period of time. The final
reaction volume was 25 µl.
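For orientation, the stated concentrations can be converted into absolute amounts per 25 µl reaction. The following is a minimal sketch using only the values given above; the dictionary and function names are hypothetical, and the conversion is plain unit arithmetic rather than part of the described procedure.

```python
REACTION_VOLUME_UL = 25.0

# Final concentrations in micromolar, as listed for the standard mix.
FINAL_CONC_UM = {
    "Hepes-KOH": 80_000.0,
    "MgCl2": 20_000.0,
    "NADPH": 500.0,
    "ATP": 1_000.0,
    "malonyl-CoA": 100.0,
    "CoA-SH": 10.0,
    "acyl-CoA": 15.0,
}

# Specific activities of the labelled substrates, in uCi per umol.
SPECIFIC_ACTIVITY = {"18:1-CoA": 50.0, "18:0-CoA": 55.0, "16:0-CoA": 54.0}


def nmol_in_reaction(conc_um, volume_ul=REACTION_VOLUME_UL):
    """Amount in nmol: uM x ul gives pmol, divided by 1000 gives nmol."""
    return conc_um * volume_ul / 1000.0


def label_in_reaction_nci(substrate):
    """Radioactivity per reaction in nCi: nmol x (uCi/umol) equals nCi."""
    return nmol_in_reaction(FINAL_CONC_UM["acyl-CoA"]) * SPECIFIC_ACTIVITY[substrate]


if __name__ == "__main__":
    for name, conc in FINAL_CONC_UM.items():
        print(f"{name}: {nmol_in_reaction(conc):g} nmol per reaction")
    for substrate in SPECIFIC_ACTIVITY:
        print(f"{substrate}: {label_in_reaction_nci(substrate):g} nCi per reaction")
```

On these figures each reaction holds 0.375 nmol of acyl-CoA substrate and 2.5 nmol of malonyl-CoA, so malonyl-CoA is present in several-fold molar excess over the primer substrate.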

Methyl esters of the acyl-CoA elongation products
were prepared as described in Example 1. Methyl esters
were separated on reversed phase silica gel KC18 TLC
plates (Whatman, 250 µm thick), quantified by
phosphorimaging, and analyzed with ImageQuant software
(Molecular Dynamics, Inc., Sunnyvale, CA). The detection
limit for each product is about 0.001 nanomoles per min
per mg microsomal protein, depending on the phosphorimage
exposure time.

Results of representative in vitro elongation
assays are shown in Figs. 1 and 2. The results indicate
that microsomes from galactose-induced cells expressing
FAE1 catalyzed multiple cycles of elongation starting
with either C16:0 acyl-CoA, C18:0 acyl-CoA, or C18:1
acyl-CoA as the substrate (Fig. 1). The 16:0 and 18:0
acyl-CoA substrates were elongated to C26:0 acyl-CoA. In
contrast, the 18:1-CoA substrate was elongated primarily
to C20:1, with only low levels of C22:1 acyl-CoA being
produced. Occasionally, trace levels of C24:1-CoA were
also observed. Although the chain length of the products
from the 18:1 acyl-CoA substrate was less than the chain
length from the saturated acyl-CoA substrates, the rate
of elongation of oleoyl-CoA was about 2- and 3-fold
higher than the rates of elongation of 16:0-CoA and
18:0-CoA, respectively.

The elongation activity observed in microsomes
from uninduced cells indicated that there was a low level
of endogenous elongase activity when 18:1-CoA or 18:0-CoA
were used as substrates. There was substantial 16:0-CoA
elongase activity (10.1 nmol mg protein⁻¹ at 30 min) in
microsomes from uninduced cells (Fig. 2). However, the
major product of 16:0 elongation using uninduced
microsomes was C18:0 acyl-CoA, with only small amounts of
products beyond this length. The elongation of the 16:0
acyl-CoA substrate presumably is due to an endogenous
yeast elongase.

Elongation of 18:1-CoA by microsomes from induced
cells occurred at a rate about 18-fold higher than in
microsomes isolated from the uninduced cells (Fig. 2).
With microsomes from induced yeast, synthesis of 20:0-CoA
from the 16:0-CoA substrate occurred at a rate similar
to that seen when the substrate was 18:0-CoA (4.2 vs. 5.1
nmol mg protein⁻¹). The total rate of elongation of
[¹⁴C]16:0-CoA by microsomes from induced cells (15.8 nmol
mg protein⁻¹ at 30 min) was more than 50% higher than
elongation of [¹⁴C]16:0-CoA by microsomes from uninduced
cells, suggesting that the FAE1 KAS utilized 16:0-CoA as
a substrate in addition to C18-C24 acyl-CoAs. The FAE1
elongase KAS activity, i.e., the difference in the 16:0
elongation rates between microsomes from induced and
uninduced cells, was 5.7 nmol mg protein⁻¹. The
elongation rate with the 16:0 substrate thus was similar
to the elongase activity of the FAE1 elongase KAS with
the 18:0 substrate.
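The background subtraction used here can be checked directly from the two 16:0-CoA rates quoted in the text. The snippet below is a sketch of that arithmetic only; the variable names are hypothetical.

```python
# Total 16:0-CoA elongation at 30 min, in nmol per mg protein (from the text).
INDUCED_RATE = 15.8      # microsomes from galactose-induced cells
UNINDUCED_RATE = 10.1    # microsomes from uninduced cells (endogenous activity)

# Activity attributed to FAE1 is the induced rate minus the endogenous rate.
fae1_specific = INDUCED_RATE - UNINDUCED_RATE

# Relative increase of the induced rate over the uninduced rate, in percent.
percent_increase = 100.0 * fae1_specific / UNINDUCED_RATE

if __name__ == "__main__":
    print(f"FAE1-attributable rate: {fae1_specific:.1f} nmol per mg protein")
    print(f"Increase over uninduced: {percent_increase:.0f}%")
```

The difference comes to 5.7 nmol per mg protein and the increase to roughly 56%, in line with the "more than 50% higher" statement.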

These results indicate that FAE1 KAS expressed in
yeast could synthesize 3-ketoacyl-CoA in vitro and, in
combination with yeast reductases and dehydrases, could
form a functional VLCFA elongase complex. In addition,
these results suggest that FAE1 is membrane-bound in
yeast cells.

Example 3 

Cloning and Sequencing of Arabidopsis Elongase Genes

The sequence of a jojoba seed cDNA (see WO
93/10241 and WO 95/15387, incorporated herein by
reference) was used to search the Arabidopsis expressed
sequence tag (EST) database of the Arabidopsis Genome
Stock Center (The Ohio State University, Columbus, Ohio).
The BLAST computer program (National Institutes of
Health, Bethesda, MD, USA) was used to perform the
search. The search identified two ESTs (ATTS1282 and
ATTS3218) that had a high degree of sequence identity
with the jojoba sequence. The ATTS1282 and ATTS3218 ESTs
appeared to be partial cDNA clones rather than
full-length clones based on the length of the jojoba
sequence.

A genomic DNA library from Arabidopsis thaliana
cv. Columbia was prepared in the lambda GEM11 vector
(Promega, Madison, Wisconsin) and was obtained from Ron
Davis, Stanford University, Stanford, CA. The library
was hybridized with ATTS1282 and ATTS3218 as probes and 2
clones were identified for each EST. Phage DNA was
isolated from each of the hybridizing clones, the genomic
insert was excised with the restriction enzyme Sac I and
subcloned into the plasmid pBluescript (Stratagene, La
Jolla, CA). One clone from the ATTS1282 hybridization
was designated EL1 and one clone from the ATTS3218
hybridization was designated EL2.

A yeast expression library, containing cDNA from
Arabidopsis thaliana cv. Columbia, was prepared in the
lambda YES expression vector described in Elledge, S.
et al., Proc. Natl. Acad. Sci. USA 88:1731-1735 (1991),
and was obtained from Ron Davis at Stanford
University, Stanford, CA. The library was hybridized
with an EL2 partial cDNA probe. A full-length EL2 cDNA
was not identified. However, the probe did identify a
full-length cDNA which was designated EL3.

A consensus sequence for the C-terminal region of
EL1, EL2 and the jojoba cDNA polypeptides was identified
by sequence alignment using DNA analysis programs from
DNAStar, Madison, Wisconsin. This consensus sequence was
used to search the Arabidopsis EST database again for
β-keto acyl synthase sequences. These searches identified
four additional putative β-keto acyl synthase ESTs, which
were designated EL4 through EL7. EL4, EL5, EL6, and EL7
have homology to Genbank Accession Nos. T04345, T44939,
T22193 and T76700, respectively.

The lambda YES cDNA expression library described
above was hybridized with the EL1 and EL4-EL7 ESTs as
probes. This screen identified full-length cDNAs for
EL1, EL5 and EL6.

The lambda GEM11 genomic library was hybridized
with the EL4 and EL7 ESTs as probes. This screen
identified full-length genomic clones for EL4 and EL7.
Phage DNA was isolated from each of the hybridizing
clones and subcloned into pBluescript as described above.

The 7 EL clones were sequenced on both strands
with regions of overlap for each sequence run.
Sequencing was carried out with an ABI automated
sequencer (Applied Biosystems, Inc., Foster City,
California), following the manufacturer's instructions.

The nucleotide sequences for the coding regions of
EL1-EL7 are shown in Figs. 3, 5, 7, 9, 11, 13 and 15,
respectively. The deduced amino acid sequences for
EL1-EL7 are shown in Figs. 4, 6, 8, 10, 12, 14 and 16,
respectively, using the standard one-letter amino acid
code. The EL1, EL2 and EL7 genomic clones appeared to
lack introns. The EL4 genomic clone contained one intron
near the 5' end of the coding region.

The nucleotide sequences of the 7 EL
polynucleotides were compared to 5 DNA sequences present
in Genbank (National Center for Biotechnology
Information, Bethesda, MD). Two of the 5 accessions were
cloned from members of the Brassicaceae: the Arabidopsis
FAE1 sequence (Accession U29142) and a Brassica napus
sequence (Accession U50771). Three of the accessions
were cloned from jojoba (Simmondsia chinensis): 2 wax
biosynthesis genes (Accessions 114084 and 114085) and a
jojoba KAS gene (Accession U37088). See also U.S. Patent
5,445,947, incorporated herein by reference.

Multiple alignment of the 12 sequences was carried
out with a computer program sold under the trade name
MEGALIGN Lasergene by DNAStar (Madison, Wisconsin).
Alignments were done using the Clustal method with
weighted residue weight table. The nucleotide sequence
similarity index and percent divergence based on the
multiple alignment algorithm are shown in Table 1. The
nucleotide sequences of EL1-EL7 are distinguishable from
the 5 DNA sequences obtained from Genbank.

The deduced amino acid sequences of the EL1-EL7
polypeptides were compared with the MEGALIGN program to
the deduced amino acid sequences of the same 5 Genbank
clones, using the Clustal method with PAM250 residue
weight table. The amino acid sequence similarity and
percent divergence are shown in Table 2. The amino acid
sequences of EL1-EL7 polypeptides are distinguishable
from those of the Genbank sequences.

Example 4
Expression of EL1 and EL2 in Yeast

The open reading frames (ORFs) for the EL2, EL4
and EL7 clones were amplified by PCR. The EL2 ORF was
cloned into lambda YES using the primers:
CTCGAGCAAGTCCACTACCACGCA and CTCGAGCGAGTCAGAAGGAACAAA.
The EL4 ORF was cloned into pYEUra3 using the primers:
GATAATTTAGAGAGGCACAGGGT and GTCGACACAAGAATGGGTAGATCCAA.
The EL7 ORF was cloned into pYEUra3 using the primers:
CAGTTCCTCAAACGAAGCTA and GTCGACTTCTCAATGGACGGTGCCGGA.
Amplified products were cloned into pYEUra3 under the
control of, and 3' to, the GAL1 promoter. The resulting
plasmids were transformed into yeast as described in
Example 1.

Yeast cultures containing full-length EL1 in lambda YES
and full-length EL2 in pYEUra3 were grown in the presence
of galactose or glucose as described in Example 2.
Microsomes were then prepared from each of the cultures
and fatty acid elongation assays were carried out as
described in Example 2.

In the first experiment, microsomes were prepared
from galactose-induced cultures of EL1, EL2 and FAE1, and
incubated with either [1-¹⁴C]18:0 acyl-CoA or [1-¹⁴C]18:1
acyl-CoA as substrate. The amounts of various reaction
products synthesized after 30 minutes (min) were
determined as described in Example 2. The results when
18:0 acyl-CoA was the substrate are shown in Table 3.
The results when 18:1 acyl-CoA was the substrate are
shown in Table 4.

The results shown in Tables 3 and 4 indicate that
the EL1 and EL2 gene products have β-ketoacyl synthase
(KAS) activity and that the KAS reaction product can be
utilized to form VLCFAs. The specific activities of the
3 KAS enzymes cannot be compared, since the relative
amount of the heterologous KAS protein in each microsomal
preparation is not known. However, the proportions of
various reaction products can be compared between FAE1,
EL1 and EL2.

The data shown in Table 3 indicate that the EL1
and EL2 KAS activities result in a higher proportion of
saturated VLCFAs than does the FAE1 KAS activity. These
results suggest that EL1 and EL2 encode novel gene
products, because EL1 and EL2 have a greater preference
for C22:0 and C24:0 acyl-CoA substrates than does FAE1.

A comparison of the relative elongation activity
of FAE1 with 18:0 and 18:1 substrates (Tables 3 and 4)
indicates that FAE1 is more active when 18:1 is the
substrate than when 18:0 is the substrate. In contrast,
the overall rate of product formation with EL1 is less
when 18:1 is the substrate than when 18:0 is the
substrate (Tables 3 and 4). EL2 is also less active when
18:1 is the substrate than when 18:0 is the substrate
(Tables 3 and 4). These results support the conclusion
that EL1 and EL2 encode novel gene products and suggest
that EL1 and EL2 have a preference for saturated fatty
acids as substrates, whereas the FAE1 gene product has a
preference for monounsaturated fatty acids as substrates.

In a second experiment, microsomes were prepared
from galactose-induced and from glucose-repressed yeast
cultures containing EL1 or EL2 coding sequences. The
microsomal preparations were incubated with either 18:0
acyl-CoA or 18:1 acyl-CoA as substrate and the fatty acid
reaction products determined as described above. The
results with the 18:0 substrate are shown in Table 5.
The results with the 18:1 substrate are shown in Table 6.

Table 5.

Elongation of 18:0-CoA by EL1 and EL2
With and Without Induction of Gene Expression

                        β-Keto Acyl Synthase Gene
                       EL1                             EL2
Acyl-CoA      +Glucose      +Galactose        +Glucose      +Galactose
            Rate¹     (%)    Rate     (%)    Rate     (%)    Rate     (%)
20:0        0.007   100.0   0.074    55.8   0.030    81.3   0.107    43.1
22:0        0.000     0.0   0.023    17.4   0.002     5.1   0.044    17.8
24:0        0.000     0.0   0.020    15.3   0.005    13.6   0.048    19.1
26:0        0.000     0.0   0.015    11.5   0.000     0.0   0.050    20.0
Total       0.007   100.0   0.133   100.0   0.037   100.0   0.249   100.0

¹ Nanomoles/minute/mg of microsomal protein

Table 6.

Elongation of 18:1-CoA by EL1 and EL2
With and Without Induction of Gene Expression

                        β-Keto Acyl Synthase Gene
                       EL1                             EL2
Acyl-CoA      +Glucose      +Galactose        +Glucose      +Galactose
            Rate¹     (%)    Rate     (%)    Rate     (%)    Rate     (%)
20:1        0.062   100.0   0.081   100.0   0.043   100.0   0.089   100.0
22:1        0.000     0.0   0.000     0.0   0.000     0.0   0.000     0.0
24:1        0.000     0.0   0.000     0.0   0.000     0.0   0.000     0.0
26:1        0.000     0.0   0.000     0.0   0.000     0.0   0.000     0.0
Total       0.062   100.0   0.081   100.0   0.043   100.0   0.089   100.0

¹ Nanomoles/minute/mg of microsomal protein


The results in Table 5 show in vitro elongase
activity for EL1 and EL2 under induced (galactose) and
uninduced (glucose) conditions. The comparison indicates
that induction with galactose results in a large increase
in overall elongase activity when 18:0 acyl-CoA is the
substrate (about 19-fold and 7-fold for EL1 and EL2,
respectively). In contrast, induction when 18:1 acyl-CoA
is the substrate results in only a small increase in
elongase activity (about 1.3-fold and 2-fold for EL1 and
EL2, respectively), as shown in Table 6.
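The fold-induction figures quoted above follow from the "Total" rows of Tables 5 and 6. A minimal sketch of that calculation is given below; the data structure and function name are hypothetical, and the rates are copied from the tables.

```python
# Total elongation rates (nmol/min/mg microsomal protein) from the
# "Total" rows of Table 5 (18:0-CoA) and Table 6 (18:1-CoA).
TOTAL_RATES = {
    ("EL1", "18:0"): {"glucose": 0.007, "galactose": 0.133},
    ("EL2", "18:0"): {"glucose": 0.037, "galactose": 0.249},
    ("EL1", "18:1"): {"glucose": 0.062, "galactose": 0.081},
    ("EL2", "18:1"): {"glucose": 0.043, "galactose": 0.089},
}


def fold_induction(rates):
    """Ratio of the galactose-induced rate to the glucose-repressed rate."""
    return rates["galactose"] / rates["glucose"]


fold = {key: fold_induction(rates) for key, rates in TOTAL_RATES.items()}

if __name__ == "__main__":
    for (gene, substrate), value in fold.items():
        print(f"{gene} with {substrate}-CoA: {value:.1f}-fold")
```

The ratios come out at about 19 and 6.7 for the 18:0 substrate and about 1.3 and 2.1 for the 18:1 substrate, matching the rounded values in the text.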

The results in Table 5 show that little or no
VLCFA products are made by yeast microsomes under
uninduced conditions. Upon induction of EL1 and EL2 gene
expression, however, significant quantities of C20:0,
C22:0, C24:0 and C26:0 are made. The data in Tables 5
and 6 are consistent with the results in Tables 3 and 4,
which indicated that EL1 and EL2 were more active with a
saturated fatty acid substrate than with a
monounsaturated substrate.

The data in Tables 5 and 6 are also consistent
with the data in Tables 3 and 4 indicating that the EL1
and EL2 gene products are more active in converting C24:0
to C26:0 than is FAE1.

In a third experiment, microsomes from induced and
uninduced cultures containing EL1 or EL2 were incubated
in the absence of cofactors involved in the β-ketoacyl
condensation reaction. Cultures were induced and
microsomes were prepared as described in Example 2. In
vitro assays were carried out as described in Example 2,
except that either ATP, CoASH or both were omitted from
the enzyme reaction mixture. In addition, one reaction
was carried out in a complete mixture having 0.01 mM of
cerulenin (Sigma, St. Louis, MO). Cerulenin is an
inhibitor of some condensing enzymes. The results are
shown in Tables 7-9.

Table 7.

Effect of Cofactors on 18:0-CoA Elongation¹

Gene   Expt⁴   +Glu²   +Gal²   -ATP³   -CoA³   -A&C³   +Cer³
EL1    1       .037    .109    .095    .105    .119    .141
       2       N.D.    .090    .125    .093    .270    .176
EL2    1       .033    .112    .168    .127    .143    .238
       2       N.D.    .120    .178    .133    .195    .302

¹ Activity in nanomoles/minute/mg of microsomal protein.

² +Glu: microsomes from cultures grown in the presence of glucose and
incubated in standard reaction mix; +Gal: microsomes from cultures grown
in the presence of galactose and incubated in standard reaction mix.

³ Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP and
Coenzyme A omitted from reaction mix; +Cer: Standard reaction mix
containing 0.01 mM cerulenin.

⁴ Experiment No.

Table 8.

Effect of Cofactors on Elongation Products of EL1¹

Prod.   +Glu²   +Gal²   -ATP³   -CoA³   -A&C³   +Cer³
20:0     53.9    46.2    34.4    47.8    41.7    46.7
22:0     14.4    18.7    13.7    18.0    19.4    16.2
24:0     18.5    18.1    20.6    19.1    16.7    17.7
26:0     13.2    17.1    31.4    15.2    22.3    19.4
Total   100.0   100.0   100.0   100.0   100.0   100.0

¹ Amount of indicated product as a percent of total products formed.
Results of one experiment for +Glucose; average of two experiments for
other conditions.

² +Glu: microsomes from cultures grown in the presence of glucose and
incubated in standard reaction mix; +Gal: microsomes from cultures grown
in the presence of galactose and incubated in standard reaction mix.

³ Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP and
Coenzyme A omitted from reaction mix; +Cer: Standard reaction mix
containing 0.01 mM cerulenin.

Table 9.

Effect of Cofactors on Elongation Products of EL2¹

Prod.   +Glu²   +Gal²   -ATP³   -CoA³   -A&C³   +Cer³
20:0     54.5    47.1    34.1    45.3    38.0    41.8
22:0     17.1    19.1    16.4    19.2    15.9    16.1
24:0      5.8    19.4    20.8    19.9    18.4    20.4
26:0     22.6    14.5    28.9    15.8    27.8    21.8
Total   100.0   100.0   100.0   100.0   100.0   100.0

¹ Amount of indicated product as a percent of total products formed.
Results of one experiment for +Glucose; average of two experiments for
other conditions.

² +Glu: microsomes from cultures grown in the presence of glucose and
incubated in standard reaction mix; +Gal: microsomes from cultures grown
in the presence of galactose and incubated in standard reaction mix.

³ Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP and
Coenzyme A omitted from reaction mix; +Cer: Standard reaction mix
containing 0.01 mM cerulenin.

The results in Table 7 indicate that omission of
ATP and/or CoA from the incubation mixture does not have
a significant effect on the overall amounts of VLCFAs
synthesized by the in vitro KAS activity of EL1 or EL2.
The results also show that cerulenin does not inhibit the
KAS activity of EL1 or EL2. The data in Tables 8 and 9
confirm that EL1 and EL2 KAS activity produces
significant amounts of C24:0 and C26:0 acyl-CoA products.
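One way to read Table 7 is to express each modified reaction relative to the complete (+Gal) reaction from the same experiment. The sketch below does this with the tabulated rates; it is an illustrative calculation only, with hypothetical names, and it makes no statement about statistical significance.

```python
# Rates from Table 7 (nmol/min/mg microsomal protein), keyed by gene and
# experiment number. All columns are for galactose-induced microsomes.
TABLE_7 = {
    ("EL1", 1): {"+Gal": 0.109, "-ATP": 0.095, "-CoA": 0.105, "-A&C": 0.119, "+Cer": 0.141},
    ("EL1", 2): {"+Gal": 0.090, "-ATP": 0.125, "-CoA": 0.093, "-A&C": 0.270, "+Cer": 0.176},
    ("EL2", 1): {"+Gal": 0.112, "-ATP": 0.168, "-CoA": 0.127, "-A&C": 0.143, "+Cer": 0.238},
    ("EL2", 2): {"+Gal": 0.120, "-ATP": 0.178, "-CoA": 0.133, "-A&C": 0.195, "+Cer": 0.302},
}


def relative_to_complete(row):
    """Each condition's rate divided by the complete (+Gal) rate."""
    complete = row["+Gal"]
    return {cond: rate / complete for cond, rate in row.items() if cond != "+Gal"}


relative = {key: relative_to_complete(row) for key, row in TABLE_7.items()}
lowest_ratio = min(min(ratios.values()) for ratios in relative.values())

if __name__ == "__main__":
    for (gene, expt), ratios in relative.items():
        summary = ", ".join(f"{cond} {value:.2f}" for cond, value in ratios.items())
        print(f"{gene} expt {expt}: {summary}")
    print(f"Lowest ratio observed: {lowest_ratio:.2f}")
```

In every case the ratio stays above about 0.87 of the complete reaction, and the cerulenin-containing reactions are all above 1, which is the pattern the paragraph above summarizes.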

To the extent not already indicated, it will be 
understood by those of ordinary skill in the art that any 
one of the various specific embodiments herein described 
and illustrated may be further modified to incorporate 
features shown in other of the specific embodiments. 

The foregoing detailed description has been
provided for a better understanding of the invention only
and no unnecessary limitation should be understood
therefrom as some modifications will be apparent to those
skilled in the art without deviating from the spirit and
scope of the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION
(i) APPLICANT: CARGILL, INCORPORATED
(ii) TITLE OF THE INVENTION: FATTY ACID ELONGASES



(iii) NUMBER OF SEQUENCES: 14 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson P.C., P. A. 

(B) STREET: 60 South Sixth Street, Suite 3300 

(C) CITY: Minneapolis 

(D) STATE: MN

(E) COUNTRY: USA
(F) ZIP: 55402 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/868,373 

(B) FILING DATE: 03-JUN-1997 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Lundquist, Ronald C 

(B) REGISTRATION NUMBER: 37,875 

(C) REFERENCE/DOCKET NUMBER: 07039/064WO1 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 612-335-5050

(B) TELEFAX: 612-288-9696
(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1560 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGGATCGAG AGAGATTAAC GGCGGAGATG GCGTTTCGAG ATTCATCATC GGCCGTTATA 60

AGAATTCGAA GACGTTTGCC GGATTTATTA ACGTCCGTTA AGCTCAAATA CGTGAAGCTT 120

GGACTTCACA ACTCTTGCAA CGTGACCACC ATTCTCTTCT TCTTAATTAT TCTTCCTTTA 180

ACCGGAACCG TGCTGGTTCA GCTAACCGGT CTAACGTTCG ATACGTTCTC TGAGCTTTGG 240

TCTAACCAGG CGGTTCAACT CGACACGGCG ACGAGACTTA CCTGCTTGGT TTTCCTCTCC 300

TTCGTTTTGA CCCTCTACGT GGCTAACCGG TCTAAACCGG TTTACCTAGT GGATTTCTCC 360

TGCTACAAAC CGGAAGACGA GCGTAAAATA TCAGTAGATT CGTTCTTGAC GATGACTGAG 420

GAAAATGGAT CATTCACCGA TGACACGGTT CAGTTCCAGC AAAGAATCTC GAACCGGGCC 480

GGTTTGGGAG ACGAGACGTA TCTGCCACGT GGCATAACTT CAACGCCCCC GAAGCTAAAT 540

ATGTCAGAGG CACGTGCCGA AGCTGAAGCC GTTATGTTTG GAGCCTTAGA TTCCCTCTTC 600

GAGAAAACCG GAATTAAACC GGCCGAAGTC GGAATCTTGA TAGTAAACTG CAGCTTATTC 660

AATCCGACGC CGTCTCTATC AGCGATGATC GTGAACCATT ACAAGATGAG AGAAGACATC 720

AAAAGTTACA ACCTCGGAGG AATGGGTTGC TCCGCCGGAT TAATCTCAAT CGATCTCGCT 780

AACAATCTCC TCAAAGCAAA CCCTAATTCT TACGCTGTCG TGGTAAGCAC GGAAAACATA 840

ACCCTAAACT GGTACTTCGG AAATGACCGG TCAATGCTCC TCTGCAACTG CATCTTCCGA 900

ATGGGCGGAG CTGCGATTCT CCTCTCTAAC CGCCGTCAAG ACCGGAAGAA GTCAAAGTAC 960

TCGCTGGTCA ACGTCGTTCG AACACATAAA GGATCAGACG ACAAGAACTA CAATTGCGTG 1020

TACCAGAAGG AAGACGAGAG AGGAACAATC GGTGTCTCTT TAGCTAGAGA GCTCATGTCT 1080

GTCGCCGGAG ACGCTCTGAA AACAAACATC ACGACTTTAG GACCGATGGT TCTTCCATTG 1140

TCAGAGCAGT TGATGTTCTT GATTTCCTTG GTCAAAAGGA AGATGTTCAA GTTAAAAGTT 1200

AAACCGTATA TTCCGGATTT CAAGCTAGCT TTCGAGCATT TCTGTATTCA CGCAGGAGGT 1260

AGAGCGGTTC TAGACGAAGT GCAGAAGAAT CTTGATCTCA AAGATTGGCA CATGGAACCT 1320

TCTAGAATGA CTTTGCACAG ATTTGGTAAC ACTTCGAGTA GCTCGCTTTG GTATGAGATG 1380

GCTTATACCG AAGCTAAGGG TCGGGTTAAA GCTGGTGACC GACTTTGGCA GATTGCGTTT 1440

GGATCGGGTT TCAAGTGTAA TAGTGCGGTT TGGAAAGCGT TACGACCGGT TTCGACGGAG 1500

GAGATGACCG GTAATGCTTG GGCTGGTTCG ATTGATCAAT ATCCGGTTAA AGTTGTGCAA 1560

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 520 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asp Arg Glu Arg Leu Thr Ala Glu Met Ala Phe Arg Asp Ser Ser Ser Ala Val Ile 20

Arg Ile Arg Arg Arg Leu Pro Asp Leu Leu Thr Ser Val Lys Leu Lys Tyr Val Lys Leu 40

Gly Leu His Asn Ser Cys Asn Val Thr Thr Ile Leu Phe Phe Leu Ile Ile Leu Pro Leu 60

Thr Gly Thr Val Leu Val Gln Leu Thr Gly Leu Thr Phe Asp Thr Phe Ser Glu Leu Trp 80

Ser Asn Gln Ala Val Gln Leu Asp Thr Ala Thr Arg Leu Thr Cys Leu Val Phe Leu Ser 100

Phe Val Leu Thr Leu Tyr Val Ala Asn Arg Ser Lys Pro Val Tyr Leu Val Asp Phe Ser 120

Cys Tyr Lys Pro Glu Asp Glu Arg Lys Ile Ser Val Asp Ser Phe Leu Thr Met Thr Glu 140

Glu Asn Gly Ser Phe Thr Asp Asp Thr Val Gln Phe Gln Gln Arg Ile Ser Asn Arg Ala 160

Gly Leu Gly Asp Glu Thr Tyr Leu Pro Arg Gly Ile Thr Ser Thr Pro Pro Lys Leu Asn 180

Met Ser Glu Ala Arg Ala Glu Ala Glu Ala Val Met Phe Gly Ala Leu Asp Ser Leu Phe 200

Glu Lys Thr Gly Ile Lys Pro Ala Glu Val Gly Ile Leu Ile Val Asn Cys Ser Leu Phe 220

Asn Pro Thr Pro Ser Leu Ser Ala Met Ile Val Asn His Tyr Lys Met Arg Glu Asp Ile 240

Lys Ser Tyr Asn Leu Gly Gly Met Gly Cys Ser Ala Gly Leu Ile Ser Ile Asp Leu Ala 260

Asn Asn Leu Leu Lys Ala Asn Pro Asn Ser Tyr Ala Val Val Val Ser Thr Glu Asn Ile 280

Thr Leu Asn Trp Tyr Phe Gly Asn Asp Arg Ser Met Leu Leu Cys Asn Cys Ile Phe Arg 300

Met Gly Gly Ala Ala Ile Leu Leu Ser Asn Arg Arg Gln Asp Arg Lys Lys Ser Lys Tyr 320

Ser Leu Val Asn Val Val Arg Thr His Lys Gly Ser Asp Asp Lys Asn Tyr Asn Cys Val 340

Tyr Gln Lys Glu Asp Glu Arg Gly Thr Ile Gly Val Ser Leu Ala Arg Glu Leu Met Ser 360

Val Ala Gly Asp Ala Leu Lys Thr Asn Ile Thr Thr Leu Gly Pro Met Val Leu Pro Leu 380

Ser Glu Gln Leu Met Phe Leu Ile Ser Leu Val Lys Arg Lys Met Phe Lys Leu Lys Val 400

Lys Pro Tyr Ile Pro Asp Phe Lys Leu Ala Phe Glu His Phe Cys Ile His Ala Gly Gly 420

Arg Ala Val Leu Asp Glu Val Gln Lys Asn Leu Asp Leu Lys Asp Trp His Met Glu Pro 440

Ser Arg Met Thr Leu His Arg Phe Gly Asn Thr Ser Ser Ser Ser Leu Trp Tyr Glu Met 460

Ala Tyr Thr Glu Ala Lys Gly Arg Val Lys Ala Gly Asp Arg Leu Trp Gln Ile Ala Phe 480

Gly Ser Gly Phe Lys Cys Asn Ser Ala Val Trp Lys Ala Leu Arg Pro Val Ser Thr Glu 500

Glu Met Thr Gly Asn Ala Trp Ala Gly Ser Ile Asp Gln Tyr Pro Val Lys Val Val Gln 520



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1479 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGGATTACC CCATGAAGAA GGTAAAAATC TTTTTCAACT ACCTCATGGC GCATCGCTTC 60

AAGCTCTGCT TCTTACCATT AATGGTTGCT ATAGCCGTGG AGGCGTCTCG TCTTTCCACA 120

CAAGATCTCC AAAACTTTTA CCTCTACTTA CAAAACAACC ACACATCTCT AACCATGTTC 180

TTCCTTTACC TCGCTCTCGG GTCGACTCTT TACCTCATGA CCCGGCCCAA ACCCGTTTAT 240

CTCGTTGACT TTAGCTGCTA CCTCCCACCG TCGCATCTCA AAGCCAGCAC CCAGAGGATC 300

ATGCAACACG TAAGGCTTGT ACGAGAAGCA GGCGCGTGGA AGCAAGAGTC CGATTACTTG 360

ATGGACTTCT GCGAGAAGAT TCTAGAACGT TCCGGTCTAG GCCAAGAGAC GTACGTACCC 420

GAAGGTCTTC AAACTTTGCC ACTACAACAG AATTTGGCTG TATCACGTAT AGAGACGGAG 480

GAAGTTATTA TTGGTGCGGT CGATAATCTG TTTCGCAACA CGGGAATAAG CCCTAGTGAT 540

ATAGGTATAT TGGTGGTGAA TTCAAGCACT TTTAATCCAA CACCTTCGCT ATCAAGTATC 600

TTAGTGAATA AGTTTAAACT TAGGGATAAT ATAAAGAGCT TGAATCTTGG TGGGATGGGG 660

TGTAGCGCTG GAGTCATCGC TATCGATGCG GCTAAGAGCT TGTTACAAGT TCATAGAAAC 720

ACTTATGCTC TTGTGGTGAG CACGGAGAAC ATCACTCAAA ACTTGTACAT GGGTAACAAC 780

AAATCAATGT TGGTTACAAA CTGTTTGTTC CGTATAGGTG GGGCCGCGAT TTTGCTTTCT 840

AACCGGTCTA TAGATCGTAA ACGCGCAAAA TACGAGCTTG TTCACACCGT GCGGGTCCAT 900

ACCGGAGCAG ATGACCGATC CTATGAATGT GCAACTCAAG AAGAGGATGA AGATGGCATA 960

GTTGGGGTTT CCTTGTCAAA GAATCTACCA ATGGTAGCTG CAAGAACCCT AAAGATCAAT 1020

ATCGCAACTT TGGGTCCGCT TGTTCTTCCC ATAAGCGAGA AGTTTCACTT CTTTGTGAGG 1080

TTCGTTAAAA AGAAGTTTCT CAACCCCAAG CTAAAGCATT ACATTCCGGA TTTCAAGCTC 1140

GCATTCGAGC ATTTCTGTAT CCATGCGGGT GGTAGAGCGC TAATTGATGA GATGGAGAAG 1200

AATCTTCATC TAACTCCACT AGACGTTGAG GCTTCAAGAA TGACATTACA CAGGTTTGGT 1260

AATACCTCTT CGAGCTCCAT TTGGTACGAG TTGGCTTACA CAGAAGCCAA AGGAAGGATG 1320

ACGAAAGGAG ATAGGATTTG GCAGATTGCG TTGGGGTCAG GTTTTAAGTG TAATAGTTCA 1380

GTTTGGGTGG CTCTTCGTAA CGTCAAGCCT TCTACTAATA ATCCTTGGGA ACAGTGTCTA 1440

CACAAATATC CAGTTGAGAT CGATATAGAT TTAAAAGAG 1479

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 493 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1512 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTACGTCAGG GTAGAACAAA GAGTAAACAC TTAAGCAAAA CAATTTGTCC TACTCTTAGG 60

TTATCTCCAA TGAAGAACTT AAAGATGGTT TTCTTCAAGA TCCTCTTTAT CTCTTTAATG 120

GCAGGATTAG CCATGAAAGG ATCTAAGATC AACGTAGAAG ATCTCCAAAA GTTCTCCCTC 180

CACCATACAC AGAACAACCT CCAAACCATA AGCCTTCTAT TGTTTCTTGT CGTTTTTGTG 240

TGGATCCTCT ACATGTTAAC CCGACCTAAA CCCGTTTACC TTGTTGATTT CTCCTGCTAC 300

CTTCCACCGT CGCATCTCAA GGTCAGTATC CAAACCCTAA TGGGACACGC AAGACGTGCA 360

AGAGAAGCAG GCATGTGTTG GAAGAACAAA GAGAGCGACC ATTTAGTTGA CTTCCAGGAG 420

AAGATTCTTG AACGTTCCGG TCTTGGTCAA GAAACCTACA TCCCCGAGGG TCTTCAGTGC 480

TTCCCACTTC AGCAAGGCAT GGGTGCTTCA CGTAAAGAGA CGGAAGAAGT AATCTTCGGA 540

GCTCTTGACA ATCTTTTTCG CAACACCGGT GTAAAACCTG ATGATATCGG TATATTGGTG 600

GTGAATTCTA GCACGTTTAA TCCAACTCCA TCACTCGCCT CCATGATTGT GAACAAGTAC 660

AAACTCAGAG ACAACATCAA GAGTTTGAAT CTTGGAGGGA TGGGTTGCAG TGCCGGAGTT 720

ATAGCTGTTG ATGTCGCTAA GGGATTACTA CAAGTTCATA GGAACACTTA TGCTATTGTA 780

GTAAGCACAG AGAACATCAC TCAGAACTTA TACTTGGGGA AAAACAAATC AATGCTAGTC 840

ACAAACTGTT TGTTCCGCGT TGGTGGTGCT GCGGTTCTGC TTTCAAACAG ATCTAGAGAC 900

CGTAACCGCG CCAAATACGA GCTTGTTCAC ACCGTACGGA TCCATACCGG ATCAGATGAT 960

AGGTCGTTCG AATGTGCGAC ACAAGAAGAG GATGAAGATG GTATAATTGG AGTTACCTTG 1020

ACAAAGAATC TACCTATGGT GGCTGCAAGG ACTCTTAAGA TAAATATCGC AACTTTGGGT 1080

CCTCTTGTAC TTCCATTAAA AGAGAAGCTA GCCTTCTTTA TTACTTTTGT CAAGAAGAAG 1140

TATTTCAAGC CAGAGTTAAG GAATTATACA CCAGATTTCA AGCTTGCCTT TGAGCATTTC 1200

TGTATCCACG CTGGTGGAAG AGCTCTAATA GATGAGCTGG AGAAGAACCT TAAGCTTTCT 1260

CCGTTACACG TAGAGGCGTC AAGAATGACA CTACACAGGT TTGGTAACAC TTCTTCTAGC 1320

TCAATCTGGT ACGAGTTAGC TTATACAGAA GCTAAAGGAA GGATGAAGGA AGGAGATAGG 1380

ATTTGGCAGA TTGCTTTGGG GTCAGGTTTT AAGTGTAACA GTTCAGTATG GGTGGCTCTG 1440

CGAGACGTTA AGCCTTCAGC TAACAGTCCA TGGGAAGACT GTATGGATAG ATATCCGGTT 1500

GAGATTGATA TT 1512

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 504 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Leu Arg Gln Gly Arg Thr Lys Ser Lys His Leu Ser Lys Thr Ile Cys
1 5 10 15

Pro Thr Leu Arg Leu Ser Pro Met Lys Asn Leu Lys Met Val Phe Phe
20 25 30






WO 98/54954 



PCT/US98/11384 



Lys Ile Leu Phe Ile Ser Leu Met Ala Gly Leu Ala Met Lys Gly Ser
35 40 45

Lys Ile Asn Val Glu Asp Leu Gln Lys Phe Ser Leu His His Thr Gln
50 55 60

Asn Asn Leu Gln Thr Ile Ser Leu Leu Leu Phe Leu Val Val Phe Val
65 70 75 80

Trp Ile Leu Tyr Met Leu Thr Arg Pro Lys Pro Val Tyr Leu Val Asp
85 90 95

Phe Ser Cys Tyr Leu Pro Pro Ser His Leu Lys Val Ser Ile Gln Thr
100 105 110

Leu Met Gly His Ala Arg Arg Ala Arg Glu Ala Gly Met Cys Trp Lys
115 120 125

Asn Lys Glu Ser Asp His Leu Val Asp Phe Gln Glu Lys Ile Leu Glu
130 135 140

Arg Ser Gly Leu Gly Gln Glu Thr Tyr Ile Pro Glu Gly Leu Gln Cys
145 150 155 160

Phe Pro Leu Gln Gln Gly Met Gly Ala Ser Arg Lys Glu Thr Glu Glu
165 170 175

Val Ile Phe Gly Ala Leu Asp Asn Leu Phe Arg Asn Thr Gly Val Lys
180 185 190

Pro Asp Asp Ile Gly Ile Leu Val Val Asn Ser Ser Thr Phe Asn Pro
195 200 205

Thr Pro Ser Leu Ala Ser Met Ile Val Asn Lys Tyr Lys Leu Arg Asp
210 215 220

Asn Ile Lys Ser Leu Asn Leu Gly Gly Met Gly Cys Ser Ala Gly Val
225 230 235 240

Ile Ala Val Asp Val Ala Lys Gly Leu Leu Gln Val His Arg Asn Thr
245 250 255

Tyr Ala Ile Val Val Ser Thr Glu Asn Ile Thr Gln Asn Leu Tyr Leu
260 265 270

Gly Lys Asn Lys Ser Met Leu Val Thr Asn Cys Leu Phe Arg Val Gly
275 280 285

Gly Ala Ala Val Leu Leu Ser Asn Arg Ser Arg Asp Arg Asn Arg Ala
290 295 300

Lys Tyr Glu Leu Val His Thr Val Arg Ile His Thr Gly Ser Asp Asp
305 310 315 320

Arg Ser Phe Glu Cys Ala Thr Gln Glu Glu Asp Glu Asp Gly Ile Ile
325 330 335

Gly Val Thr Leu Thr Lys Asn Leu Pro Met Val Ala Ala Arg Thr Leu
340 345 350

Lys Ile Asn Ile Ala Thr Leu Gly Pro Leu Val Leu Pro Leu Lys Glu
355 360 365

Lys Leu Ala Phe Phe Ile Thr Phe Val Lys Lys Lys Tyr Phe Lys Pro
370 375 380

Glu Leu Arg Asn Tyr Thr Pro Asp Phe Lys Leu Ala Phe Glu His Phe
385 390 395 400

Cys Ile His Ala Gly Gly Arg Ala Leu Ile Asp Glu Leu Glu Lys Asn
405 410 415

Leu Lys Leu Ser Pro Leu His Val Glu Ala Ser Arg Met Thr Leu His
420 425 430

Arg Phe Gly Asn Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala Tyr
435 440 445

Thr Glu Ala Lys Gly Arg Met Lys Glu Gly Asp Arg Ile Trp Gln Ile
450 455 460

Ala Leu Gly Ser Gly Phe Lys Cys Asn Ser Ser Val Trp Val Ala Leu
465 470 475 480

Arg Asp Val Lys Pro Ser Ala Asn Ser Pro Trp Glu Asp Cys Met Asp
485 490 495

Arg Tyr Pro Val Glu Ile Asp Ile
500






(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1650 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGGGTAGAT CCAACGAGCA AGATCTGCTC TCTACCGAGA TCGTTAATCG TGGGATCGAA 60

CCATCCGGTC CTAACGCCGG CTCACCAACG TTCTCGGTTA GGGTCAGGAG ACGTTTGCCT 120

GATTTTCTTC AGTCGGTGAA CTTGAAGTAC GTGAAACTTG GTTACCACTA CCTCATAAAC 180

CATGCGGTTT ATTTGGCGAC CATACCGGTT CTTGTGCTGG TTTTTAGTGC TGAGGTTGGG 240

AGTTTAAGCA GAGAAGAGAT TTGGAAGAAG CTTTGGGACT ATGATCTTGC AACTGTTATC 300

GGATTCTTCG GTGTCTTTGT TTTAACCGCT TGTGTCTACT TCATGTCTCG TCCTCGCTCT 360

GTTTATCTTA TTGATTTCGC TTGTTACAAG CCCTCCGATG AACACAAGGT GACAAAAGAA 420

GAGTTCATAG AACTAGCGAG AAAATCAGGG AAGTTCGACG AAGAGACACT CGGTTTCAAG 480

AAGAGGATCT TACAAGCCTC AGGCATAGGC GACGAGACAT ACGTCCCAAG ATCCATCTCT 540

TCATCAGAAA ACATAACAAC GATGAAAGAA GGTCGTGAAG AAGCCTCTAC AGTGATCTTT 600

GGAGCACTAG ACGAACTCTT CGAGAAGACA CGTGTAAAAC CTAAAGACGT TGGTGTCCTT 660

GTGGTTAACT GTAGCATTTT CAACCCGACA CCGTCGTTGT CCGCAATGGT GATAAACCAT 720

TACAAGATGA GAGGGAACAT ACTTAGTTAC AACCTTGGAG GGATGGGATG TTCGGCTGGA 780

ATCATAGCTA TTGATCTTGC TCGTGACATG CTTCAGTCTA ACCCTAATAG TTATGCTGTT 840

GTTGTGAGTA CTGAGATGGT TGGGTATAAT TGGTACGTGG GAAGTGACAA GTCAATGGTT 900

ATACCTAATT GTTTCTTTAG GATGGGTTGT TCTGCCGTTA TGCTCTCTAA CCGTCGTCGT 960

GACTTTCGCC ATGCTAAGTA CCGTCTCGAG CACATTGTCC GAACTCATAA GGCTGCTGAC 1020

GACCGTAGCT TCAGGAGTGT GTACCAGGAA GAAGATGAAC AAGGATTCAA GGGGTTGAAG 1080

ATAAGTAGAG ACTTAATGGA AGTTGGAGGT GAAGCTCTCA AGACAAACAT CACTACCTTA 1140

GGTCCTCTTG TCCTACCTTT CTCCGAGCAG CTTCTCTTCT TTGCTGCTTT GGTCCGCCGA 1200

ACATTCTCAC CTGCTGCCAA AACGTCCACA ACCACTTCCT TCTCTACTTC CGCCACCGCA 1260

AAAACCAATG GAATCAAGTC TTCCTCTTCC GATCTGTCCA AGCCATACAT CCCGGACTAC 1320

AAGCTCGCCT TCGAGCATTT TTGCTTCCAC GCGGCAAGCA AAGTAGTGCT TGAAGAGCTT 1380

CAAAAGAATC TAGGCTTGAG TGAAGAGAAT ATGGAGGCTT CTAGGATGAC ACTTCACAGG 1440

TTTGGAAACA CTTCTAGCAG TGGAATCTGG TATGAGTTGG CTTACATGGA GGCCAAGGAA 1500

AGTGTTCGTA GAGGCGATAG GGTTTGGCAG ATCGCTTTCG GTTCTGGTTT TAAGTGTAAC 1560

AGTGTGGTGT GGAAGGCAAT GAGGAAGGTG AAGAAGCCAA CCAGGAACAA TCCTTGGGTG 1620

GATTGCATCA ACCGTTACCC TGTGCCTCTC 1650

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 550 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Thr Met Lys Glu Gly Arg
180 185 190

Glu Glu Ala Ser Thr Val Ile Phe Gly Ala Leu Asp Glu Leu Phe Glu
195 200 205

Lys Thr Arg Val Lys Pro Lys Asp Val Gly Val Leu Val Val Asn Cys
210 215 220

Ser Ile Phe Asn Pro Thr Pro Ser Leu Ser Ala Met Val Ile Asn His
225 230 235 240

Tyr Lys Met Arg Gly Asn Ile Leu Ser Tyr Asn Leu Gly Gly Met Gly
245 250 255

Cys Ser Ala Gly Ile Ile Ala Ile Asp Leu Ala Arg Asp Met Leu Gln
260 265 270

Ser Asn Pro Asn Ser Tyr Ala Val Val Val Ser Thr Glu Met Val Gly
275 280 285

Tyr Asn Trp Tyr Val Gly Ser Asp Lys Ser Met Val Ile Pro Asn Cys
290 295 300

Phe Phe Arg Met Gly Cys Ser Ala Val Met Leu Ser Asn Arg Arg Arg
305 310 315 320

Asp Phe Arg His Ala Lys Tyr Arg Leu Glu His Ile Val Arg Thr His
325 330 335

Lys Ala Ala Asp Asp Arg Ser Phe Arg Ser Val Tyr Gln Glu Glu Asp
340 345 350

Glu Gln Gly Phe Lys Gly Leu Lys Ile Ser Arg Asp Leu Met Glu Val
355 360 365

Gly Gly Glu Ala Leu Lys Thr Asn Ile Thr Thr Leu Gly Pro Leu Val
370 375 380

Leu Pro Phe Ser Glu Gln Leu Leu Phe Phe Ala Ala Leu Val Arg Arg
385 390 395 400

Thr Phe Ser Pro Ala Ala Lys Thr Ser Thr Thr Thr Ser Phe Ser Thr
405 410 415

Ser Ala Thr Ala Lys Thr Asn Gly Ile Lys Ser Ser Ser Ser Asp Leu
420 425 430

Ser Lys Pro Tyr Ile Pro Asp Tyr Lys Leu Ala Phe Glu His Phe Cys
435 440 445

Phe His Ala Ala Ser Lys Val Val Leu Glu Glu Leu Gln Lys Asn Leu
450 455 460

Gly Leu Ser Glu Glu Asn Met Glu Ala Ser Arg Met Thr Leu His Arg
465 470 475 480

Phe Gly Asn Thr Ser Ser Ser Gly Ile Trp Tyr Glu Leu Ala Tyr Met
485 490 495

Glu Ala Lys Glu Ser Val Arg Arg Gly Asp Arg Val Trp Gln Ile Ala
500 505 510

Phe Gly Ser Gly Phe Lys Cys Asn Ser Val Val Trp Lys Ala Met Arg
515 520 525

Lys Val Lys Lys Pro Thr Arg Asn Asn Pro Trp Val Asp Cys Ile Asn
530 535 540

Arg Tyr Pro Val Pro Leu
545 550



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1611 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TCGAGCTACG TCAGGGCTTT TATATGCACA AATTCTCATA AAGTTTTCAA TTTTATTCCA 60

TTTTTCTCGG AAGCCATGGA AGCTGCTAAT GAGCCTGTTA ATGGCGGATC CGTACAGATC 120

CGAACAGAGA ACAACGAAAG ACGAAAGCTT CCTAATTTCT TACAAAGCGT CAACATGAAA 180

TACGTCAAGC TAGGTTATCA TTACCTCATT ACTCATCTCT TCAAGCTCTG TTTGGTTCCA 240

TTAATGGCGG TTTTAGTCAC AGAGATCTCT CGATTAACAA CAGACGATCT TTACCAGATT 300

TGGCTTCATC TCCAATACAA TCTCGTTGCT TTCATCTTTC TCTCTGCTTT AGCTATCTTT 360

GGCTCCACCG TTTACATCAT GAGTCGTCCC AGATCTGTTT ATCTCGTTGA TTACTCTTGT 420

TATCTTCCTC CGGAGAGTCT TCAGGTTAAG TATCAGAAGT TTATGGATCA TTCTAAGTTG 480

ATTGAAGATT TCAATGAGTC ATCTTTAGAG TTTCAGAGGA AGATTCTTGA ACGTTCTGGT 540

TTAGGAGAAG AGACTTATCT CCCTGAAGCT TTACATTGTA TCCCTCCGAG GCCTACGATG 600

ATGGCGGCTC GTGAGGAATC TGAGCAGGTA ATGTTTGGTG CTCTTGATAA GCTTTTCGAG 660

AATACCAAGA TTAACCCTAG GGATATTGGT GTGTTGGTTG TGAATTGTAG CTTGTTTAAT 720

CCTACACCTT CGTTGTCAGC TATGATTGTT AACAAGTATA AGCTTAGAGG GAATGTTAAG 780

AGTTTTAACC TTGGTGGAAT GGGGTGTAGT GCTGGTGTTA TCTCTATCGA TTTAGCTAAA 840

GATATGTTGC AAGTTCATAG GAATACTTAT GCTGTTGTGG TTAGTACTGA GAACATTACT 900

CAGAATTGGT ATTTTGGGAA TAAGAAGGCT ATGTTGATTC CGAATTGTTT GTTTCGTGTT 960

GGTGGTTCGG CGATTTTGTT GTCGAACAAG GGGAAAGATC GTAGACGGTC TAAGTATAAG 1020

CTTGTTCATA CCGTTAGGAC TCATAAAGGA GCTGTTGAGA AGGCTTTCAA CTGTGTTTAC 1080

CAAGAGCAAG ATGATAATGG GAAGACCGGG GTTTCGTTGT CGAAAGATCT TATGGCTATA 1140

GCTGGGGAAG CTCTTAAGGC GAATATCACT ACTTTAGGTC CTTTGGTTCT TCCTATAAGT 1200

GAGCAGATTC TGTTTTTCAT GACTTTGGTT ACGAAGAAAC TGTTTAACTC GAAGCTGAAG 1260

CCGTATATTC CGGATTTCAA GCTTGCGTTT GATCATTTCT GTATCCATGC TGGTGGTAGA 1320

GCTGTGATTG ATGAGCTTGA GAAGAATCTG CAGCTTTCGC AGACTCATGT CGAGGCATCC 1380

AGAATGACAC TGCACAGATT TGGAAACACT TCTTCGAGCT CGATTTGGTA TGAACTGGCT 1440

TACATAGAGG CTAAAGGTAG GATGAAGAAA GGAAACCGGG TTTGGCAGAT TGCTTTTGGA 1500

AGTGGGTTTA AGTGTAACAG TGCAGTTTGG GTGGCTCTAA ACAATGTCAA GCCTTCGGTT 1560

AGTAGTCCGT GGGAACACTG CATCGACCGA TATCCGGTTA AGCTCGACTT C 1611

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 537 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



Ser Ser Tyr Val Arg Ala Phe Ile Cys Thr Asn Ser His Lys Val Phe
1 5 10 15

Asn Phe Ile Pro Phe Phe Ser Glu Ala Met Glu Ala Ala Asn Glu Pro
20 25 30

Val Asn Gly Gly Ser Val Gln Ile Arg Thr Glu Asn Asn Glu Arg Arg
35 40 45

Lys Leu Pro Asn Phe Leu Gln Ser Val Asn Met Lys Tyr Val Lys Leu
50 55 60

Gly Tyr His Tyr Leu Ile Thr His Leu Phe Lys Leu Cys Leu Val Pro
65 70 75 80

Leu Met Ala Val Leu Val Thr Glu Ile Ser Arg Leu Thr Thr Asp Asp
85 90 95

Leu Tyr Gln Ile Trp Leu His Leu Gln Tyr Asn Leu Val Ala Phe Ile
100 105 110

Phe Leu Ser Ala Leu Ala Ile Phe Gly Ser Thr Val Tyr Ile Met Ser
115 120 125

Arg Pro Arg Ser Val Tyr Leu Val Asp Tyr Ser Cys Tyr Leu Pro Pro
130 135 140

Glu Ser Leu Gln Val Lys Tyr Gln Lys Phe Met Asp His Ser Lys Leu
145 150 155 160

Ile Glu Asp Phe Asn Glu Ser Ser Leu Glu Phe Gln Arg Lys Ile Leu
165 170 175

Glu Arg Ser Gly Leu Gly Glu Glu Thr Tyr Leu Pro Glu Ala Leu His
180 185 190

Cys Ile Pro Pro Arg Pro Thr Met Met Ala Ala Arg Glu Glu Ser Glu
195 200 205

Gln Val Met Phe Gly Ala Leu Asp Lys Leu Phe Glu Asn Thr Lys Ile
210 215 220

Asn Pro Arg Asp Ile Gly Val Leu Val Val Asn Cys Ser Leu Phe Asn
225 230 235 240

Pro Thr Pro Ser Leu Ser Ala Met Ile Val Asn Lys Tyr Lys Leu Arg
245 250 255

Gly Asn Val Lys Ser Phe Asn Leu Gly Gly Met Gly Cys Ser Ala Gly
260 265 270

Val Ile Ser Ile Asp Leu Ala Lys Asp Met Leu Gln Val His Arg Asn
275 280 285

Thr Tyr Ala Val Val Val Ser Thr Glu Asn Ile Thr Gln Asn Trp Tyr
290 295 300

Phe Gly Asn Lys Lys Ala Met Leu Ile Pro Asn Cys Leu Phe Arg Val
305 310 315 320

Gly Gly Ser Ala Ile Leu Leu Ser Asn Lys Gly Lys Asp Arg Arg Arg
325 330 335

Ser Lys Tyr Lys Leu Val His Thr Val Arg Thr His Lys Gly Ala Val
340 345 350

Glu Lys Ala Phe Asn Cys Val Tyr Gln Glu Gln Asp Asp Asn Gly Lys
355 360 365

Thr Gly Val Ser Leu Ser Lys Asp Leu Met Ala Ile Ala Gly Glu Ala
370 375 380

Leu Lys Ala Asn Ile Thr Thr Leu Gly Pro Leu Val Leu Pro Ile Ser
385 390 395 400

Glu Gln Ile Leu Phe Phe Met Thr Leu Val Thr Lys Lys Leu Phe Asn
405 410 415

Ser Lys Leu Lys Pro Tyr Ile Pro Asp Phe Lys Leu Ala Phe Asp His
420 425 430

Phe Cys Ile His Ala Gly Gly Arg Ala Val Ile Asp Glu Leu Glu Lys
435 440 445

Asn Leu Gln Leu Ser Gln Thr His Val Glu Ala Ser Arg Met Thr Leu
450 455 460

His Arg Phe Gly Asn Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala
465 470 475 480

Tyr Ile Glu Ala Lys Gly Arg Met Lys Lys Gly Asn Arg Val Trp Gln
485 490 495

Ile Ala Phe Gly Ser Gly Phe Lys Cys Asn Ser Ala Val Trp Val Ala
500 505 510

Leu Asn Asn Val Lys Pro Ser Val Ser Ser Pro Trp Glu His Cys Ile
515 520 525

Asp Arg Tyr Pro Val Lys Leu Asp Phe
530 535



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1502 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TCTCCGACGA TGCCTCAGGC ACCGATGCCA GAGTTCTCTA GCTCGGTGAA GCTCAAGTAC 60

GTGAAACTTG GTTACCAATA TTTGGTTAAC CATTTCTTGA GTTTTCTTTT GATCCCGATC 120

ATGGCTATTG TCGCCGTTGA GCTTCTTCGG ATGGGTCCTG AAGAGATCCT TAATGTTTGG 180

AATTCACTCC AGTTTGACCT AGTTCAGGTT CTATGTTCTT CCTTCTTTGT CATCTTCATC 240

TCCACTGTTT ACTTCATGTC CAAGCCACGC ACCATCTACC TCGTTGACTA TTCTTGTTAC 300

AAGCCACCTG TCACGTGTCG TGTCCCCTTC GCAACTTTCA TGGAACACTC TCGTTTGATC 360

CTCAAGGACA AGCCTAAGAG CGTCGAGTTC CAAATGAGAA TCCTTGAACG TTCTGGCCTC 420

GGTGAGGAGA CTTGTCTCCC TCCGGCTATT CATTATATTC CTCCCACACC AACCATGGAC 480

GCGGCTAGAA GCGAGGCTCA GATGGTTATC TTCGAGGCCA TGGACGATCT TTTCAAGAAA 540

ACCGGTCTTA AACCTAAAGA CGTCGACATC CTTATCGTCA ACTGCTCTCT TTTCTCTCCC 600

ACACCATCGC TCTCAGCTAT GGTCATCAAC AAATATAAGC TTAGGAGTAA TATCAAGAGC 660

TTCAATCTTT CGGGGATGGG CTGCAGCGCG GGCCTGATCT CAGTTGATCT AGCCCGCGAC 720

TTGCTCCAAG TTCATCCCAA TTCAAATGCA ATCATCGTCA GCACGGAGAT CATAACGCCT 780

AATTACTATC AAGGCAACGA GAGAGCCATG TTGTTACCCA ATTGTCTCTT CCGCATGGGT 840

GCGGCAGCCA TACACATGTC AAACCGCCGG TCTGACCGGT GGCGAGCCAA ATACAAGCTT 900

TCCCACCTCG TCCGGACACA CCGTGGCGCT GACGACAAGT CTTTCTACTG TGTCTACGAA 960

CAGGAAGACA AAGAAGGACA CGTTGGCATC AACTTGTCCA AAGATCTCAT GGCCATCGCC 1020

GGTGAAGCCC TCAAGGCAAA CATCACCACA ATAGGTCCTT TGGTCCTACC GGG<3TCAGAA 1080

CAACTTCTCT TCCTCACGTC CCTAATCGGA CGTAAAATCT TCAACCCGAA ATGGAAACCA 1140

TACATACCGG ATTTCAAGCT GGCCTTCGAA CACTTTTGCA TTCACGCAGG AGGCAGAGCG 1200

GTGATCGACG AGCTCCAAAA GAATCTACAA CTATCAGGAG AACACGTTGA GGCCTCAAGA 1260

ATGACACTAC ATCGTTTTGG TAACACGTCA TCTTCATCGT TATGGTAQGA GCTTAGCTAC 1320

ATCGAGTCTA AAGGGAGAAT GAGGAGAGGC GATCGCGTTT JaGGAAA*fCGC GTTTGGGAGT 1380

GGTTTCAAGT GTAACTCTGC CGTGTGGAAG TGTAACCGTA CGATTAAGAC ACCTAAGGAC 1440

GGACCATGGT CCGATTGTAT CGACCGTTAC CCTGTCTTTA TTCCCGAAGT TGTCAAACTC 1500

TA 1502

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



Ser Pro Thr Met Pro Gln Ala Pro Met Pro Glu Phe Ser Ser Ser Val
1 5 10 15

Lys Leu Lys Tyr Val Lys Leu Gly Tyr Gln Tyr Leu Val Asn His Phe
20 25 30

Leu Ser Phe Leu Leu Ile Pro Ile Met Ala Ile Val Ala Val Glu Leu
35 40 45

Leu Arg Met Gly Pro Glu Glu Ile Leu Asn Val Trp Asn Ser Leu Gln
50 55 60

Phe Asp Leu Val Gln Val Leu Cys Ser Ser Phe Phe Val Ile Phe Ile
65 70 75 80

Ser Thr Val Tyr Phe Met Ser Lys Pro Arg Thr Ile Tyr Leu Val Asp
85 90 95

Tyr Ser Cys Tyr Lys Pro Pro Val Thr Cys Arg Val Pro Phe Ala Thr
100 105 110

Phe Met Glu His Ser Arg Leu Ile Leu Lys Asp Lys Pro Lys Ser Val
115 120 125

Glu Phe Gln Met Arg Ile Leu Glu Arg Ser Gly Leu Gly Glu Glu Thr
130 135 140

Cys Leu Pro Pro Ala Ile His Tyr Ile Pro Pro Thr Pro Thr Met Asp
145 150 155 160

Ala Ala Arg Ser Glu Ala Gln Met Val Ile Phe Glu Ala Met Asp Asp
165 170 175

Leu Phe Lys Lys Thr Gly Leu Lys Pro Lys Asp Val Asp Ile Leu Ile
180 185 190

Val Asn Cys Ser Leu Phe Ser Pro Thr Pro Ser Leu Ser Ala Met Val
195 200 205

Ile Asn Lys Tyr Lys Leu Arg Ser Asn Ile Lys Ser Phe Asn Leu Ser
210 215 220

Gly Met Gly Cys Ser Ala Gly Leu Ile Ser Val Asp Leu Ala Arg Asp
225 230 235 240

Leu Leu Gln Val His Pro Asn Ser Asn Ala Ile Ile Val Ser Thr Glu
245 250 255

Ile Ile Thr Pro Asn Tyr Tyr Gln Gly Asn Glu Arg Ala Met Leu Leu
260 265 270

Pro Asn Cys Leu Phe Arg Met Gly Ala Ala Ala Ile His Met Ser Asn
275 280 285

Arg Arg Ser Asp Arg Trp Arg Ala Lys Tyr Lys Leu Ser His Leu Val
290 295 300

Arg Thr His Arg Gly Ala Asp Asp Lys Ser Phe Tyr Cys Val Tyr Glu
305 310 315 320

Gln Glu Asp Lys Glu Gly His Val Gly Ile Asn Leu Ser Lys Asp Leu
325 330 335

Met Ala Ile Ala Gly Glu Ala Leu Lys Ala Asn Ile Thr Thr Ile Gly
340 345 350

Pro Leu Val Leu Pro Ala Ser Glu Gln Leu Leu Phe Leu Thr Ser Leu
355 360 365

Ile Gly Arg Lys Ile Phe Asn Pro Lys Trp Lys Pro Tyr Ile Pro

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1548 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATGGACGGTG CCGGAGAATC ACGACTCGGT GGTGATGGTG GTGGTGATGG TTCTGTTGGA   60
GTTCAGATCC GACAAACACG GATGCTACCG GATTTTCTCC AGAGCGTGAA TCTCAAGTAT  120
GTGAAATTAG GTTACCATTA CTTAATCTCA AATCTCTTGA CTCTCTGTTT ATTCCCTCTC  180
GCCGTTGTTA TCTCCGTCGA AGCCTCTCAG ATGAACCCAG ATGATCTCAA ACAGCTCTGG  240
ATCCATCTAC AATACAATCT GGTTAGTATC ATCATCTGTT CAGCGATTCT AGTCTTCGGG  300
TTAACGGTTT ATGTTATGAC CCGACCTAGA CCCGTTTACT TGGTTGATTT CTCTTGTTAT  360
CTCCCACCTG ATCATCTCAA AGCTCCTTAC GCTCGGTTCA TGGAACATTC TAGACTCACC  420
GGAGATTTCG ATGACTCTGC TCTCGAGTTT CAACGCAAGA TCCTTGAGCG TTCTGGTTTA  480
GGGGAAGACA CTTATGTCCC TGAAGCTATG CATTATGTTC CACCGAGAAT TTCAATGGCT  540
GCTGCTAGAG AAGAAGCTGA ACAAGTCATG TTTGGTGCTT TAGATAACCT TTTCGCTAAC  600
ACTAATGTGA AACCAAAGGA TATTGGAATC CTTGTTGTGA ATTGTAGTCT CTTTAATCCA  660
ACTCCTTCGT TATCTGCAAT GATTGTGAAC AAGTATAAGC TTAGAGGTAA CATTAGAAGC  720
TACAATCTAG GCGGTATGGG TTGCAGCGCG GGAGTTATCG CTGTGGATCT TGCTAAAGAC  780
ATGTTGTTGG TACATAGGAA CACTTATGCG GTTGTTGTTT CTACTGAGAA CATTACTCAG  840
AATTGGTATT TTGGTAACAA GAAATCGATG TTGATACCGA ACTGCTTGTT TCGAGTTGGT  900
GGCTCTGCGG TTTTGCTATC GAACAAGTCG AGGGACAAGA GACGGTCTAA GTACAGGCTT  960
GTACATGTAG TCAGGACTCA CCGTGGAGCA GATGATAAAG CTTTCCGTTG TGTTTATCAA 1020
GAGCAGGATG ATACAGGGAG AACCGGGGTT TCGTTGTCGA AAGATCTAAT GGCGATTGCA 1080
GGGGAAACTC TCAAAACCAA TATCACTACA TTGGGTCCTC TTGTTCTACC GATAAGTGAG 1140
CAGATTCTCT TCTTTATGAC TCTAGTTGTG AAGAAGCTCT TTAACGGTAA AGTGAAACCG 1200
TATATCCCGG ATTTCAAACT TGCTTTCGAG CATTTCTGTA TCCATGCTGG TGGAAGAGCT 1260
GTGATCGATG AGTTAGAGAA GAATCTGCAG CTTTCACCAG TTCATGTCGA GGCTTCGAGG 1320
ATGACTCTTC ATCGATTTGG TAACACATCT TCGAGCTCCA TTTGGTATGA ATTGGCTTAC 1380
ATTGAAGCGA AGGGAAGGAT GCGAAGAGGT AATCGTGTTT GGCAAATCGC GTTCGGAAGT 1440
GGATTTAAAT GTAATAGCGC GATTTGGGAA GCATTAAGGC ATGTGAAACC TTCGAACAAC 1500
AGTCCTTGGG AAGATTGTAT TGACAAGTAT CCGGTAACTT TAAGTTAT               1548
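As a consistency check on the listing, the 1548-nucleotide length stated for SEQ ID NO:13 corresponds to 516 codons, which matches the 516-amino-acid length stated for SEQ ID NO:14. The following is a minimal Python sketch, assuming the standard genetic code and using only the final row of the listing (nucleotides 1501-1548). The helper name `translate` and the variable names are illustrative and are not part of the disclosure.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Final row of SEQ ID NO:13 (nucleotides 1501-1548), copied from the listing.
TAIL = "AGTCCTTGGG AAGATTGTAT TGACAAGTAT CCGGTAACTT TAAGTTAT"

codon_count = 1548 // 3         # codons in the full-length listing
tail_peptide = translate(TAIL)  # residues 501-516 of the encoded polypeptide
```

Under these assumptions the translated tail reads Ser Pro Trp Glu Asp Cys Ile Asp Lys Tyr Pro Val Thr Leu Ser Tyr, which agrees with the last sixteen residues listed for SEQ ID NO:14.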

(2) INFORMATION FOR SEQ ID NO:14: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala Tyr Ile Glu Ala Lys
    450                 455                 460

Gly Arg Met Arg Arg Gly Asn Arg Val Trp Gln Ile Ala Phe Gly Ser
465                 470                 475                 480

Gly Phe Lys Cys Asn Ser Ala Ile Trp Glu Ala Leu Arg His Val Lys
                485                 490                 495

Pro Ser Asn Asn Ser Pro Trp Glu Asp Cys Ile Asp Lys Tyr Pro Val
            500                 505                 510

Thr Leu Ser Tyr
        515
WHAT IS CLAIMED IS:

1. An isolated polynucleotide encoding a polypeptide
having an amino acid sequence selected from the group
consisting of: an amino acid sequence substantially
identical to SEQ ID NO:2, an amino acid sequence
substantially identical to SEQ ID NO:4, an amino acid
sequence substantially identical to SEQ ID NO:6, an amino
acid sequence substantially identical to SEQ ID NO:8, an
amino acid sequence substantially identical to SEQ ID
NO:10, an amino acid sequence substantially identical to
SEQ ID NO:12, and an amino acid sequence substantially
identical to SEQ ID NO:14.

2. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:2.

3. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:4.

4. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:6.

5. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:8.

6. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:10.

7. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:12.

8. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO:14.
9. An isolated polynucleotide, wherein said
polynucleotide is selected from the group consisting of:

a) SEQ ID NO:1;
b) SEQ ID NO:3;
c) SEQ ID NO:5;
d) SEQ ID NO:7;
e) SEQ ID NO:9;
f) SEQ ID NO:11;
g) SEQ ID NO:13;
h) an RNA analog of SEQ ID NO:1;
i) an RNA analog of SEQ ID NO:3;
j) an RNA analog of SEQ ID NO:5;
k) an RNA analog of SEQ ID NO:7;
l) an RNA analog of SEQ ID NO:9;
m) an RNA analog of SEQ ID NO:11;
n) an RNA analog of SEQ ID NO:13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14.
10. An isolated polypeptide having an amino acid
sequence selected from the group consisting of: an amino
acid sequence substantially identical to SEQ ID NO:2, an
amino acid sequence substantially identical to SEQ ID
NO:4, an amino acid sequence substantially identical to
SEQ ID NO:6, an amino acid sequence substantially
identical to SEQ ID NO:8, an amino acid sequence
substantially identical to SEQ ID NO:10, an amino acid
sequence substantially identical to SEQ ID NO:12, and an
amino acid sequence substantially identical to SEQ ID
NO:14.

11. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:2.

12. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:4.

13. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:6.

14. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:8.

15. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:10.

16. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:12.

17. The polypeptide of claim 10, wherein said amino
acid sequence is SEQ ID NO:14.

18. A transgenic plant containing a nucleic acid
construct comprising a polynucleotide selected from the
group consisting of:

a) SEQ ID NO:1;
b) SEQ ID NO:3;
c) SEQ ID NO:5;
d) SEQ ID NO:7;
e) SEQ ID NO:9;
f) SEQ ID NO:11;
g) SEQ ID NO:13;
h) an RNA analog of SEQ ID NO:1;
i) an RNA analog of SEQ ID NO:3;
j) an RNA analog of SEQ ID NO:5;
k) an RNA analog of SEQ ID NO:7;
l) an RNA analog of SEQ ID NO:9;
m) an RNA analog of SEQ ID NO:11;
n) an RNA analog of SEQ ID NO:13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14.



19. The plant of claim 18, wherein said construct
further comprises a regulatory element operably linked to
said polynucleotide.

20. The plant of claim 19, wherein said regulatory
element is a tissue-specific promoter.

21. The plant of claim 20, wherein said regulatory
element is an epidermal cell-specific promoter.

22. The plant of claim 20, wherein said regulatory
element is a seed-specific promoter that is operably
linked in sense orientation to said polynucleotide.

23. The plant of claim 22, wherein said plant has
altered levels of very long chain fatty acids in seeds
compared to the levels in a plant lacking said nucleic
acid construct.
24. A transgenic plant containing a nucleic acid
construct comprising a polynucleotide encoding a
polypeptide selected from the group consisting of: an
amino acid sequence substantially identical to SEQ ID
NO:2, an amino acid sequence substantially identical to
SEQ ID NO:4, an amino acid sequence substantially
identical to SEQ ID NO:6, an amino acid sequence
substantially identical to SEQ ID NO:8, an amino acid
sequence substantially identical to SEQ ID NO:10, an
amino acid sequence substantially identical to SEQ ID
NO:12, and an amino acid sequence substantially identical
to SEQ ID NO:14.

25. The plant of claim 24, wherein said construct
further comprises a regulatory element operably linked to
said polynucleotide.

26. The plant of claim 25, wherein said regulatory
element is a tissue-specific promoter.

27. The plant of claim 26, wherein said regulatory
element is an epidermal cell-specific promoter.

28. The plant of claim 26, wherein said regulatory
element is a seed-specific promoter that is operably
linked in sense orientation to said polynucleotide.

29. The plant of claim 28, wherein said plant has
altered levels of very long chain fatty acids in seeds
compared to the levels in a plant lacking said nucleic
acid construct.

30. A method of altering the levels of very long chain
fatty acids in a plant, comprising the steps of:

A) creating a nucleic acid construct, said construct
comprising a polynucleotide selected from the group
consisting of:

a) SEQ ID NO:1;
b) SEQ ID NO:3;
c) SEQ ID NO:5;
d) SEQ ID NO:7;
e) SEQ ID NO:9;
f) SEQ ID NO:11;
g) SEQ ID NO:13;
h) an RNA analog of SEQ ID NO:1;
i) an RNA analog of SEQ ID NO:3;
j) an RNA analog of SEQ ID NO:5;
k) an RNA analog of SEQ ID NO:7;
l) an RNA analog of SEQ ID NO:9;
m) an RNA analog of SEQ ID NO:11;
n) an RNA analog of SEQ ID NO:13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14; and

B) introducing said construct into said plant, wherein
said polynucleotide is effective for altering the levels
of very long chain fatty acids in said plant.
FATTY ACID ELONGASES
Field of the Invention
This invention relates to fatty acid elongase
complexes and nucleic acids encoding elongase proteins.
More particularly, the invention relates to nucleic acids
encoding β-keto acyl synthase proteins that are effective
for producing very long chain fatty acids, polypeptides
produced from such nucleic acids and transgenic plants
expressing such nucleic acids.

Background of the Invention
Plants are known to synthesize very long chain
fatty acids (VLCFAs). VLCFAs are saturated or
unsaturated monocarboxylic acids with an unbranched
even-numbered carbon chain that is greater than 18 carbons in
length. Many VLCFAs are 20-32 carbons in length, but
VLCFAs can be up to 60 carbons in length. Important
VLCFAs include erucic acid (22:1, i.e., a 22 carbon chain
with one double bond), nervonic acid (24:1), behenic acid
(22:0), and arachidic acid (20:0).
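The shorthand used above (carbon count, colon, number of double bonds) can be handled mechanically. The following is a minimal Python sketch, assuming the "greater than 18 carbons" definition given in the text. The names `FattyAcid`, `parse_shorthand` and `is_vlcfa` are illustrative only.

```python
from typing import NamedTuple


class FattyAcid(NamedTuple):
    carbons: int       # number of carbons in the chain
    double_bonds: int  # number of carbon-carbon double bonds


def parse_shorthand(code: str) -> FattyAcid:
    """Parse 'carbons:double_bonds' shorthand such as '22:1'."""
    carbons, double_bonds = code.strip().split(":")
    return FattyAcid(int(carbons), int(double_bonds))


def is_vlcfa(fa: FattyAcid) -> bool:
    """Apply the definition in the text: chain longer than 18 carbons."""
    return fa.carbons > 18


# Fatty acids named in this specification, with their shorthand.
EXAMPLES = {
    "erucic acid": "22:1",
    "nervonic acid": "24:1",
    "behenic acid": "22:0",
    "arachidic acid": "20:0",
    "stearic acid": "18:0",
}
parsed = {name: parse_shorthand(code) for name, code in EXAMPLES.items()}
vlcfa_names = sorted(name for name, fa in parsed.items() if is_vlcfa(fa))
```

Under this definition stearic acid (18:0) is excluded, while the four 20- to 24-carbon acids are classified as VLCFAs.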

Plant seeds accumulate mostly 16- and 18-carbon
fatty acids. VLCFAs are not desirable in edible oils.
Oilseeds of the Cruciferae (e.g., rapeseed) and a few
other plants, however, accumulate C20 and C22 fatty acids
(FAs). Although plant breeders have developed rapeseed
lines that have low levels of VLCFAs for edible oil
purposes, even lower levels would be desirable. On the
other hand, vegetable oils having elevated levels of
VLCFAs are desirable for certain industrial uses,
including uses as lubricants, fuels and as a feedstock
for plastics, pharmaceuticals and cosmetics.
The biosynthesis of saturated fatty acids up to an
18-carbon chain occurs in the chloroplast. C2 units from
acyl thioesters are linked sequentially, beginning with
the condensation of acetyl Coenzyme A (CoA) and malonyl
acyl carrier protein (ACP) to form a C4 acyl fatty acid.
This condensation reaction is catalyzed by a β-ketoacyl
synthase III (KASIII). β-ketoacyl moieties are also
referred to as 3-ketoacyl moieties.

The enzyme β-ketoacyl synthase I (KASI) is
involved in the addition of C2 groups to form the C6 to
C16 saturated fatty acids. KASI catalyzes the stepwise
condensation of a fatty acyl moiety (C4 to C14) with
malonyl-ACP to produce a 3-ketoacyl-ACP product that is 2
carbons longer than the substrate. The last condensation
reaction in the chloroplast, converting C16 to C18, is
catalyzed by β-ketoacyl synthase II (KASII).

Each elongation cycle involves three enzymatic steps
in addition to the condensation reaction discussed
above. Briefly, the β-ketoacyl condensation
product is reduced to β-hydroxyacyl-ACP, dehydrated to
the enoyl-ACP, and finally reduced to a fully reduced
acyl-ACP. The fully reduced fatty acyl-ACP reaction
product then serves as the substrate for the next cycle
of elongation.
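Because each condensation adds two carbons, the number of elongation cycles between two chain lengths follows by simple arithmetic. The following Python sketch assumes only the enzyme assignments stated above (KASIII for the initial C2 to C4 condensation, KASI for C4 through C14 substrates, KASII for C16 to C18). The function names are illustrative.

```python
def condensations(start_carbons: int, end_carbons: int) -> int:
    """Number of two-carbon condensation cycles between two chain lengths."""
    if end_carbons < start_carbons or (end_carbons - start_carbons) % 2:
        raise ValueError("chain lengths must differ by a non-negative even number")
    return (end_carbons - start_carbons) // 2


def condensing_enzyme(substrate_carbons: int) -> str:
    """Condensing enzyme named in the text for a given substrate chain length."""
    if substrate_carbons == 2:
        return "KASIII"
    if 4 <= substrate_carbons <= 14 and substrate_carbons % 2 == 0:
        return "KASI"
    if substrate_carbons == 16:
        return "KASII"
    raise ValueError("no plastidial condensing enzyme described for this substrate")


# One (substrate length, enzyme) entry per cycle from C2 up to the C18 product.
pathway = [(n, condensing_enzyme(n)) for n in range(2, 18, 2)]
```

On these assumptions, eight condensations lead from the C2 starter to C18: one by KASIII, six by KASI and one by KASII.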

The C18 saturated fatty acid (stearic acid, 18:0)
can be transported out of the chloroplast and converted
to the monounsaturate C18:1 (oleic acid), and the
polyunsaturates C18:2 (linoleic acid) and C18:3
(α-linolenic acid). C18:0 and C18:1 can also be elongated
outside the chloroplast to form VLCFAs. The formation of
VLCFAs involves the sequential condensation of two-carbon
groups from malonyl CoA with a C18:0 or C18:1 fatty acid
substrate. Elongation of fatty acids longer than 18
carbons depends on the activity of a fatty acid elongase
complex to carry out four separate enzyme reactions
similar to those described above for fatty acid synthesis
in the chloroplast. Fehling, Biochem. Biophys. Acta
1082:239-246 (1991). In plants, elongase complexes are
distinct from fatty acid synthases since elongases are
extraplastidial and membrane bound.

Mutations have been identified in an Arabidopsis
gene associated with fatty acid elongation. This gene,
designated the FAE1 gene, is involved in the condensation
step of an elongation cycle. See WO 96/13582,
incorporated herein by reference. Plants carrying a
mutation in FAE1 have significant decreases in the levels
of VLCFAs in seeds. Genes associated with wax
biosynthesis in jojoba have also been cloned and
sequenced (WO 95/15387, incorporated herein by
reference).

Very long chain fatty acids are key components of
many biologically important compounds in animals, plants,
and microorganisms. For example, in animals, the VLCFA
arachidonic acid is a precursor to many prostaglandins.
In plants, VLCFAs are major constituents of
triacylglycerols in many seed oils, are essential
precursors for cuticular wax production, and are utilized
in the synthesis of glycosylceramides, an important
component of the plasma membrane.

Obtaining detailed information on the biochemistry
of KAS enzymes has been hampered by the difficulties
encountered when purifying membrane bound enzymes.
Although elongase activities have been partially purified
from a number of sources, or studied using cell
fractions, the elucidation of the biochemistry of
elongase complexes has been hampered by the complexity of
the membrane fractions used as the enzyme source. For
example, until recently, it was unclear whether
plant elongase complexes were composed of a
multifunctional polypeptide similar to the FAS found in
animals and yeast, or if the complexes existed as
discrete and dissociable enzymes similar to the FAS of
plants and bacteria. Partial purification of an elongase
KAS, immunoblot identification of the hydroxy acyl
dehydrase, and the recent cloning of a KAS gene (FAE1)
suggest that the enzyme activities of elongase complexes
exist on individual enzymes.

Summary of the Invention
The invention disclosed herein relates to an
isolated polynucleotide selected from one of the
following: SEQ ID NO:1; SEQ ID NO:3; SEQ ID NO:5; SEQ ID
NO:7; SEQ ID NO:9; SEQ ID NO:11; SEQ ID NO:13; an RNA
analog of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, or 15; and a
polynucleotide having a nucleic acid sequence
complementary to one of the above. The polynucleotide
can also be a nucleic acid fragment of one of the above
sequences that is at least 15 nucleotides in length and
that hybridizes under stringent conditions to genomic DNA
encoding the polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ
ID NO:14.

Also disclosed herein is an isolated polypeptide
that has an amino acid sequence substantially identical
to one of the following: SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID
NO:14. Also disclosed are isolated polynucleotides
encoding polypeptides substantially identical in their
amino acid sequence to: SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID
NO:14.

The invention also relates to a transgenic plant
containing a nucleic acid construct. The nucleic acid
construct comprises a polynucleotide described above.
The construct further comprises a regulatory element
operably linked to the polynucleotide. The regulatory
element may be a tissue-specific promoter, for example, an
epidermal cell-specific promoter or a seed-specific
promoter. The regulatory element may be operably linked
to the polynucleotide in sense or antisense orientation.
The plant has altered levels of very long chain fatty
acids in tissues where the polynucleotide is expressed,
compared to a parental plant lacking the nucleic acid
construct.

A method is disclosed for altering the levels of
very long chain fatty acids in a plant. The method
comprises the steps of creating a nucleic acid construct
and introducing the construct into the plant. The
construct includes a polynucleotide selected from one of
the following: SEQ ID NO:1; SEQ ID NO:3; SEQ ID NO:5; SEQ
ID NO:7; SEQ ID NO:9; SEQ ID NO:11; SEQ ID NO:13; an RNA
analog of SEQ ID NO:1, 3, 5, 7, 9, 11, 13, or 15; and a
polynucleotide having a nucleic acid sequence
complementary to one of the above. The polynucleotide
can also be a nucleic acid fragment of one of the above
that is at least 15 nucleotides in length and that
hybridizes under stringent conditions to genomic DNA
encoding the polypeptide of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ
ID NO:14. The polynucleotide is effective for altering
the levels of very long chain fatty acids in the plant.

Other features and advantages of the invention 
will be apparent from the following description of the 
preferred embodiments thereof, and from the claims. 

Brief Description of the Drawings

Figure 1 shows the time course of in vitro VLCFA
synthesis by FAE1 expressed in yeast, with 3 different
acyl-CoA substrates.

Figure 2 shows the rates of in vitro VLCFA
synthesis and the VLCFA profiles of FAE1 expressed in
yeast, with 3 different acyl-CoA substrates.

Figure 3 shows the nucleotide sequence of the
coding region of the Arabidopsis EL1 polynucleotide (SEQ
ID NO:1).

Figure 4 shows the deduced amino acid sequence
(SEQ ID NO:2) for the EL1 coding sequence of Figure 3.

Figure 5 shows the nucleotide sequence of the
coding region of the Arabidopsis EL2 polynucleotide (SEQ
ID NO:3).

Figure 6 shows the deduced amino acid sequence
(SEQ ID NO:4) for the EL2 coding sequence of Figure 5.

Figure 7 shows the nucleotide sequence of the
coding region of the Arabidopsis EL3 polynucleotide (SEQ
ID NO:5).

Figure 8 shows the deduced amino acid sequence
(SEQ ID NO:6) for the EL3 coding sequence of Figure 7.

Figure 9 shows the nucleotide sequence of the
coding region of the Arabidopsis EL4 polynucleotide (SEQ
ID NO:7).

Figure 10 shows the deduced amino acid sequence
(SEQ ID NO:8) for the EL4 coding sequence of Figure 9.

Figure 11 shows the nucleotide sequence of the
coding region of the Arabidopsis EL5 polynucleotide (SEQ
ID NO:9).

Figure 12 shows the deduced amino acid sequence
(SEQ ID NO:10) for the EL5 coding sequence of Figure 11.

Figure 13 shows the nucleotide sequence of the
coding region of the Arabidopsis EL6 polynucleotide (SEQ
ID NO:11).

Figure 14 shows the deduced amino acid sequence
(SEQ ID NO:12) for the EL6 coding sequence of Figure 13.

Figure 15 shows the nucleotide sequence of the
coding region of the Arabidopsis EL7 polynucleotide (SEQ
ID NO:13).

Figure 16 shows the deduced amino acid sequence
(SEQ ID NO:14) for the EL7 coding sequence of Figure 15.



Description of the Preferred Embodiments
The present invention comprises isolated nucleic
acids (polynucleotides) that encode polypeptides having
β-ketoacyl synthase activity. The novel polynucleotides
and polypeptides of the invention are involved in the
synthesis of very long chain fatty acids and are useful
for modulating the total amounts of such fatty acids and
the specific VLCFA profile in plants.

A polynucleotide of the invention may be in the
form of RNA or in the form of DNA, including cDNA,
synthetic DNA or genomic DNA. The DNA may be
double-stranded or single-stranded, and if single-stranded, can
be either the coding strand or non-coding strand. An RNA
analog may be, for example, mRNA or a combination of
ribo- and deoxyribonucleotides. Illustrative examples of
a polynucleotide of the invention are shown in Figs. 3,
5, 7, 9, 11, 13 and 15.
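The relationship between a coding strand, its non-coding (complementary) strand and an RNA version of the same sequence can be illustrated in a few lines. This is a minimal Python sketch, assuming that an RNA analog of a DNA sequence carries uracil in place of thymine. The function names are illustrative, and the input is the first 30 nucleotides of SEQ ID NO:13.

```python
# Watson-Crick pairing for DNA, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


# First 30 nucleotides of SEQ ID NO:13 (coding strand).
coding = "ATGGACGGTGCCGGAGAATCACGACTCGGT"
noncoding = reverse_complement(coding)
rna = to_rna(coding)
```

Applying `reverse_complement` twice returns the original coding strand, which is a convenient sanity check when preparing sense and antisense constructs in silico.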

A polynucleotide of the invention typically is at
least 15 nucleotides (or base pairs, bp) in length. In
some embodiments, a polynucleotide is about 20 to 100
nucleotides in length, or about 100 to 500 nucleotides in
length. In other embodiments, a polynucleotide is
greater than about 1500 nucleotides in length and encodes
a polypeptide having the amino acid sequence shown in
Figs. 4, 6, 8, 10, 12, 14 or 16.

In some embodiments, a polynucleotide of the
invention encodes analogs or derivatives of a polypeptide
having the deduced amino acid sequence of Figs. 4, 6, 8,
10, 12, 14 or 16. Such fragments, analogs or derivatives
include, for example, naturally occurring allelic
variants, non-naturally occurring allelic variants,
deletion variants and insertion variants, that do not
substantially alter the function of the polypeptide.

A polynucleotide of the invention may further
comprise additional nucleic acids. For example, a
nucleic acid fragment encoding a secretory or leading
amino acid sequence can be fused in-frame to the amino
terminal end of one of the EL1 through EL7 polypeptides.
Other nucleic acid fragments are known in the art that
encode amino acid sequences useful for fusing in-frame to
the KAS polypeptides disclosed herein. See, e.g., U.S.
5,629,193, incorporated herein by reference. A
polynucleotide may further comprise one or more
regulatory elements operably linked to a KAS
polynucleotide disclosed herein.

The present invention also comprises
polynucleotides that hybridize to a KAS polynucleotide
disclosed herein. Such a polynucleotide typically is at
least 15 nucleotides in length. Hybridization typically
involves Southern analysis (Southern blotting), a method
by which the presence of DNA sequences in a target
nucleic acid mixture is identified by hybridization to a
labeled oligonucleotide or DNA fragment probe. Southern
analysis typically involves electrophoretic separation of
DNA digests on agarose gels, denaturation of the DNA
after electrophoretic separation, and transfer of the DNA
to nitrocellulose, nylon, or another suitable membrane
support for analysis with a radiolabeled, biotinylated,
or enzyme-labeled probe as described in sections
9.37-9.52 of Sambrook et al. (1989) Molecular Cloning, second
edition, Cold Spring Harbor Laboratory, Plainview, NY.

A polynucleotide can hybridize under moderate
stringency conditions or, preferably, under high
stringency conditions to a KAS polynucleotide disclosed
herein. High stringency conditions are used to identify
nucleic acids that have a high degree of homology to the
probe. High stringency conditions can include the use of
low ionic strength and high temperature for washing, for
example, 0.015 M NaCl/0.0015 M sodium citrate (0.1X SSC);
0.1% sodium lauryl sulfate (SDS) at 65°C. Alternatively,
a denaturing agent such as formamide can be employed
during hybridization, e.g., 50% formamide with 0.1%
bovine serum albumin/0.1% Ficoll/0.1%
polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH
6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C.
Another example is the use of 50% formamide, 5 x SSC
(0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium
phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 x
Denhardt's solution, sonicated salmon sperm DNA (50
µg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with
washes at 42°C in 0.2 x SSC and 0.1% SDS.
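The SSC strengths quoted for the high stringency conditions scale linearly from a 1X stock. The following is a small Python sketch, assuming the conventional 1X SSC composition of 0.15 M NaCl and 0.015 M sodium citrate, which is consistent with the 0.1X and 5X figures given above. The function name `ssc_molarity` is illustrative.

```python
# Assumed composition of 1X SSC (molar), consistent with the 0.1X and 5X
# values quoted in the text.
NACL_1X = 0.15
CITRATE_1X = 0.015


def ssc_molarity(strength: float) -> tuple[float, float]:
    """Return (NaCl, sodium citrate) molarity for a given SSC strength."""
    if strength < 0:
        raise ValueError("strength must be non-negative")
    return (round(strength * NACL_1X, 6), round(strength * CITRATE_1X, 6))


high_stringency_wash = ssc_molarity(0.1)   # 0.1X SSC wash at 65 degrees C
formamide_hyb = ssc_molarity(5)            # 5X SSC hybridization at 42 degrees C
formamide_wash = ssc_molarity(0.2)         # 0.2X SSC wash at 42 degrees C
```

The lower the SSC strength and the higher the temperature, the more stringent the wash, so a helper of this kind makes it easy to compare protocols on a common molar scale.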

Moderate stringency conditions refer to
hybridization conditions used to identify nucleic acids
that have a lower degree of identity to the probe than do
nucleic acids identified under high stringency
conditions. Moderate stringency conditions can include
the use of higher ionic strength and/or lower
temperatures for washing of the hybridization membrane,
compared to the ionic strength and temperatures used for
high stringency hybridization. For example, a wash
solution comprising 0.060 M NaCl/0.0060 M sodium citrate
(4X SSC) and 0.1% sodium lauryl sulfate (SDS) can be used
at 50°C, with a last wash in 1X SSC at 65°C.
Alternatively, a hybridization wash in 1X SSC at 37°C can
be used.

Hybridization can also be done by Northern
analysis (Northern blotting), a method used to identify
RNAs that hybridize to a known probe such as an
oligonucleotide, DNA fragment, cDNA or fragment thereof,
or RNA fragment. The probe is labeled with a
radioisotope such as 32P, by biotinylation or with an
enzyme. The RNA to be analyzed is usually
electrophoretically separated on an agarose or
polyacrylamide gel, transferred to nitrocellulose, nylon,
or other suitable membrane, and hybridized with the
probe, using standard techniques well known in the art
such as those described in sections 7.39-7.52 of Sambrook
et al., supra.

A polynucleotide has at least about 70% sequence
identity, preferably at least about 80% sequence
identity, more preferably at least about 90% sequence
identity to SEQ ID NO:1, 3, 5, 7, 9, 11, or 13. Sequence
identity can be determined, for example, by computer
programs designed to perform single and multiple sequence
alignments.
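Once two sequences have been aligned, percent identity is a simple count. The following Python sketch assumes the alignment has already been produced by an alignment program and that gaps are written as "-". It uses one common convention (matches divided by aligned columns, excluding columns that are gaps in both sequences); alignment programs differ in how they define the denominator, so the numbers are illustrative. The function names and the 70% default threshold argument are not part of the disclosure beyond the percentages stated above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1  # x == y and not both gaps implies a true match
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(identity: float, threshold: float = 70.0) -> bool:
    """True when identity reaches the stated minimum (70, 80 or 90 percent)."""
    return identity >= threshold


# First ten nucleotides of SEQ ID NO:13 against a hypothetical single-base variant.
identity = percent_identity("ATGGACGGTG", "ATGGATGGTG")
```

With one mismatch in ten aligned positions the sketch reports 90% identity, which would satisfy each of the 70%, 80% and 90% levels recited above.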

A polynucleotide of the invention can be obtained
by chemical synthesis, isolation and cloning from plant
genomic DNA or other means known to the art, including
the use of PCR technology carried out using
oligonucleotides corresponding to portions of SEQ ID
NO:1, 3, 5, 7, 9, 11 or 13. Polymerase chain reaction
(PCR) refers to a procedure or technique in which target
nucleic acid is amplified in a manner similar to that
described in U.S. Patent No. 4,683,195, incorporated
herein by reference, and subsequent modifications of the
procedure described therein. Generally, sequence
information from the ends of the region of interest or
beyond is employed to design oligonucleotide primers that
are identical or similar in sequence to opposite strands
of the template to be amplified. PCR can be used to
amplify specific RNA sequences, specific DNA sequences
from total genomic DNA, and cDNA transcribed from total
cellular RNA, bacteriophage or plasmid sequences, and the
like. Alternatively, a cDNA library (in an expression
vector) can be screened with KAS-specific antibody
prepared using peptide sequence(s) from hydrophilic
regions of the KAS protein of SEQ ID NO:2 and technology
known in the art.

A polypeptide of the invention comprises an
isolated polypeptide having the deduced amino acid
sequence of Figs. 2, 4, 6, 8, 10 and 12, as well as
derivatives and analogs thereof. By "isolated" is meant
a polypeptide that is expressed and produced in an
environment other than the environment in which the
polypeptide is naturally expressed and produced. For
example, a plant polypeptide is isolated when expressed
and produced in bacteria or fungi. Similarly, a plant
polypeptide is isolated when its gene coding sequence is
operably linked to a chimeric regulatory element and
expressed in a tissue where the polypeptide is not
naturally expressed. A polypeptide of the invention also
comprises variants of the KAS polypeptides disclosed
herein, as discussed above.

A full-length KAS coding sequence may comprise the
sequence shown in SEQ ID NO:1, 3, 5, 7, 9, 11 or 13.
Alternatively, a chimeric full-length KAS coding sequence
may be formed by linking, in-frame, nucleotides from the
5' region of a first KAS gene to nucleotides from the 3'
region of a second KAS gene, thereby forming a chimeric
KAS protein.
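Linking two coding sequences in-frame means joining them at codon boundaries so that the reading frame of the 3' portion is preserved. The following is a minimal Python sketch of that bookkeeping. The two input sequences are short hypothetical placeholders, not KAS sequences, and the function name `chimeric_cds` is illustrative.

```python
def chimeric_cds(first_cds: str, second_cds: str,
                 codons_from_first: int, start_codon_in_second: int) -> str:
    """Join the first N codons of one coding sequence to the codons of a
    second coding sequence starting at a given (0-based) codon index."""
    for cds in (first_cds, second_cds):
        if len(cds) % 3:
            raise ValueError("coding sequences must be a whole number of codons")
    if codons_from_first * 3 > len(first_cds):
        raise ValueError("first sequence is shorter than the requested 5' region")
    if start_codon_in_second * 3 > len(second_cds):
        raise ValueError("second sequence is shorter than the requested offset")
    five_prime = first_cds[: 3 * codons_from_first]
    three_prime = second_cds[3 * start_codon_in_second:]
    return five_prime + three_prime


# Hypothetical toy coding sequences used only to illustrate the join.
first = "ATGAAACCCGGG"
second = "ATGTTTCACTGGTAA"
fusion = chimeric_cds(first, second, codons_from_first=2, start_codon_in_second=3)
```

Because both pieces are cut at codon boundaries, the fusion length remains a multiple of three and the downstream codons of the second sequence are read unchanged.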

It should be appreciated that nucleic acid
fragments having a nucleotide sequence other than the KAS
sequences disclosed in SEQ ID NO:1, 3, 5, 7, 9, 11 or 13
can encode a polypeptide having the exemplified amino
acid sequence of SEQ ID NO:2, 4, 6, 8, 10, 12 or
14, respectively. The degeneracy of the genetic code is
well-known to the art; i.e., for many amino acids, there
is more than one nucleotide triplet which serves as the
codon for the amino acid.
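The degeneracy described above can be shown concretely: two different nucleotide sequences can encode the same peptide. The following is a minimal Python sketch assuming the standard genetic code. The first sequence is nucleotides 1501-1512 of SEQ ID NO:13; the second is a hypothetical synonymous variant constructed for illustration.

```python
from collections import defaultdict

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Group codons by the amino acid (or stop, "*") they specify.
SYNONYMS = defaultdict(list)
for codon, residue in CODON_TABLE.items():
    SYNONYMS[residue].append(codon)


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


listed = "AGTCCTTGGGAA"    # Ser Pro Trp Glu, as written in SEQ ID NO:13
variant = "TCCCCATGGGAG"   # hypothetical synonymous re-coding
same_peptide = translate(listed) == translate(variant)
```

Leucine and serine each have six codons in the standard code, while tryptophan has only one, which is why the TGG codon is unchanged between the two sequences.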

It should also be appreciated that certain amino
acid substitutions can be made in protein sequences
without affecting the function of the protein.
Generally, conservative amino acid substitutions or
substitutions of similar amino acids are tolerated
without affecting protein function. Similar amino acids
can be those that are similar in size and/or charge
properties; for example, aspartate and glutamate, and
isoleucine and valine, are both pairs of similar amino
acids. Similarity between amino acid pairs has been
assessed in the art in a number of ways. For example,
Dayhoff et al. (1978) in Atlas of Protein Sequence and
Structure, Vol. 5, Suppl. 3, pp. 345-352, which is
incorporated by reference herein, provides frequency
tables for amino acid substitutions which can be employed
as a measure of amino acid similarity.

A nucleic acid construct of the invention
comprises a polynucleotide as disclosed herein linked to
another, different polynucleotide. For example, a
full-length KAS coding sequence may be operably fused in-frame
to a nucleic acid fragment that encodes a leader
sequence, secretory sequence or other additional amino
acid sequences that may be usefully linked to a
polypeptide or peptide fragment.

A transgenic plant of the invention contains a
nucleic acid construct as described herein. In some
embodiments, a transgenic plant contains a nucleic acid
construct that comprises a polynucleotide of the
invention operably linked to at least one suitable
regulatory sequence in sense orientation. Regulatory
sequences typically do not themselves code for a gene
product. Instead, regulatory sequences affect the
expression level of the polynucleotide to which they are
linked. Examples of regulatory sequences are known in
the art and include, without limitation, minimal
promoters and promoters of genes preferentially or
exclusively expressed in seeds or in epidermal cells of
stems and leaves. Native regulatory sequences of the
polynucleotides disclosed herein can be readily isolated
by those skilled in the art and used in constructs of the
invention. Other examples of suitable regulatory
sequences include enhancers or enhancer-like elements,
introns, 3' non-coding regions such as poly A sequences
and other regulatory sequences discussed herein.
Molecular biology techniques for preparing such chimeric
genes are known in the art.

In other embodiments, a transgenic plant contains
a nucleic acid construct comprising a partial or a
full-length KAS coding sequence operably linked to at least
one suitable regulatory sequence in antisense
orientation. The chimeric gene can be introduced into a
plant, and transgenic progeny displaying expression of the
antisense construct are identified.

One may use a polynucleotide disclosed herein for
cosuppression as well as for antisense inhibition.
Cosuppression of genes in plants may be achieved by
expressing, in the sense orientation, the entire or
partial coding sequence of a gene. See, e.g., WO
94/11516, incorporated herein by reference.

Transgenic techniques for use in the invention include, without
limitation, Agrobacterium-mediated transformation, viral
vector-mediated transformation, electroporation and particle gun
transformation. Illustrative examples of transformation techniques
are described in U.S. Patent 5,204,253 (particle gun) and U.S.
Patent 5,188,958 (Agrobacterium), incorporated herein by reference.
Transformation methods utilizing the Ti and Ri plasmids of
Agrobacterium spp. typically use binary-type vectors. Walkerpeach,
C. et al., in Plant Molecular Biology Manual, S. Gelvin and R.
Schilperoort, eds., Kluwer Dordrecht, C1:1-19 (1994). If cell or
tissue cultures are used as the recipient tissue for transformation,
plants can be regenerated from transformed cultures by techniques
known to those skilled in the art.

Techniques are known for the introduction of DNA into monocots as
well as dicots, as are the techniques for culturing such plant
tissues and regenerating those tissues. Monocots which have been
successfully transformed and regenerated include wheat, corn, rye,
rice, and asparagus. See, e.g., U.S. Patent Nos. 5,484,956 and
5,550,318, incorporated herein by reference.

For efficient production of transgenic plants from plant cells, it
is desirable that the plant tissue used for transformation possess a
high capacity for regeneration. Transgenic plants of woody species
such as poplar and aspen have also been obtained. Technology is also
available for the manipulation, transformation, and regeneration of
gymnosperm plants. For example, U.S. Patent No. 5,122,466 describes
the biolistic transformation of conifers, with preferred target
tissue being meristematic and cotyledon and hypocotyl tissues. U.S.
Patent No. 5,041,382 describes enrichment of conifer embryonal
cells.

Seeds produced by transgenic plant(s) can be grown and then selfed
(or outcrossed and selfed) to obtain seeds homozygous for the
construct. Seeds can be analyzed in order to identify those
homozygotes having the desired expression of the construct.
Transgenic plants may be entered into a breeding program, e.g., to
introgress the novel construct into other lines, to transfer the
construct to other species, or for further selection of other
desirable traits. Alternatively, transgenic plants may be propagated
vegetatively for those species amenable to such techniques.

A nucleic acid construct of the invention can alter the levels of
very long chain fatty acids in plant tissues expressing the
polynucleotide, compared to VLCFA levels in corresponding tissues
from an otherwise identical plant not expressing the polynucleotide.
A comparison can be made, for example, between a non-transgenic
plant of a plant line and a transgenic plant of the same plant line.
Levels of VLCFAs having 20-32 carbons and/or levels of VLCFAs having
32-60 carbons can be altered in plants disclosed herein. Plants
having an altered VLCFA composition may be identified by techniques
known to the skilled artisan, e.g., thin layer chromatography or
gas-liquid chromatography (GLC) analysis of the appropriate plant
tissue.

A suitable group of plants with which to practice the invention is
the Brassica species, including B. napus, B. rapa, B. juncea, and
B. hirta. Other suitable plants include, without limitation, soybean
(Glycine max), sunflower (Helianthus annuus) and corn (Zea mays).

A method according to the invention comprises introducing a nucleic
acid construct into a plant cell and producing a plant (as well as
progeny of such a plant) from the transformed cell. Progeny includes
descendants of a particular plant or plant line, e.g., seeds
developed on an instant plant are descendants. Progeny of an instant
plant include seeds formed on F1, F2, F3, and subsequent generation
plants, or seeds formed on BC1, BC2, BC3, and subsequent generation
plants.

Methods and compositions according to the invention are useful in
that the resulting plants and plant lines have desirable alterations
in very long chain fatty acid composition. Suitable tissues in which
to express polynucleotides and/or polypeptides of the invention
include, without limitation, seeds, stems and leaves. Leaf tissues
of interest include cells and tissues of the epidermis, e.g., cells
that are involved in forming trichomes. Of particular interest are
epidermal cells involved in forming the cuticular layer. The
cuticular layer comprises various very long chain fatty acids and
VLCFA derivatives such as alkanes, esters, alcohols and aldehydes.
Altering the composition and amount of VLCFAs in epidermal cells and
tissues may enhance defense mechanisms and drought tolerance of
plants disclosed herein.

Polynucleotides of the invention can be used as markers in plant
genetic mapping and plant breeding programs. Such markers may
include RFLP, RAPD, or PCR markers, for example. Marker-assisted
breeding techniques may be used to identify and follow a desired
fatty acid composition during the breeding process. Marker-assisted
breeding techniques may be used in addition to, or as an alternative
to, other sorts of identification techniques. An example of
marker-assisted breeding is the use of PCR primers that specifically
amplify a sequence from a desired KAS that has been introduced into
a plant line and is being crossed into other plant lines.

Plants and plant lines disclosed herein preferably have superior
agronomic properties. Superior agronomic characteristics include,
for example, increased seed germination percentage, increased
seedling vigor, increased resistance to seedling fungal diseases
(damping off, root rot and the like), increased yield, and improved
standability.

While the invention is susceptible to various modifications and
alternative forms, certain specific embodiments thereof are
described in the general methods and examples set forth below. It
should be understood, however, that these examples are not intended
to limit the invention to the particular forms disclosed; instead,
the invention is to cover all modifications, equivalents and
alternatives falling within the scope of the invention.

EXAMPLES 
Example 1 

Cloning and Expression of FAE1 in Yeast Cells 

The open reading frame of the Arabidopsis FAE1 gene was amplified
directly by PCR, using Arabidopsis thaliana cv. Columbia genomic DNA
as a template, Pfu DNA polymerase and the following primers:
5'-CTCGAGGAGCAATGACGTCCGTTAA-3' and
5'-CTCGAGTTAGGACCGACCGTTTTG-3'. The PCR product was blunt-end cloned
into the EcoRV site of pBluescript (Stratagene, La Jolla, CA).
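As a quick sanity check on the two FAE1 primers, the following sketch (an illustration, not part of the original disclosure) reverse-complements the second primer and looks for the features one would expect from primers that flank an open reading frame: an ATG in the forward primer and an in-frame stop codon on the coding strand of the reverse primer. The helper and variable names are hypothetical. Both primers begin with CTCGAG, which is the recognition sequence of XhoI; the text does not state why that sequence was included.

```python
# Primer sequences copied from Example 1, written 5' to 3'.
FORWARD = "CTCGAGGAGCAATGACGTCCGTTAA"
REVERSE = "CTCGAGTTAGGACCGACCGTTTTG"

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase ACGT only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Coding-strand view of the reverse primer.
reverse_rc = reverse_complement(REVERSE)

# Split the coding-strand view into codons, starting at its first base.
codons = [reverse_rc[i:i + 3] for i in range(0, len(reverse_rc) - 2, 3)]

has_start_codon = "ATG" in FORWARD
both_start_with_ctcgag = FORWARD.startswith("CTCGAG") and REVERSE.startswith("CTCGAG")
stop_codon_position = codons.index("TAA") if "TAA" in codons else None

print(reverse_rc, has_start_codon, both_start_with_ctcgag, stop_codon_position)
```

The check only inspects the primer text; it says nothing about how efficiently the primers anneal or amplify.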

The FAE1 gene was excised from the Bluescript vector with BamHI, and
then subcloned into pYEUra3 (Clontech, Palo Alto, CA). pYEUra3 is a
yeast centromere-containing, episomal plasmid that is propagated
stably through cell division. The FAE1 gene was inserted downstream
of a GAL1 promoter in pYEUra3. The GAL1 promoter is induced when
galactose is present in the growth medium and repressed when glucose
is present in the growth medium.

Insertion of the FAE1 gene in the sense orientation was confirmed by
PCR, and pYEUra3/FAE1 was used to transform Saccharomyces cerevisiae
strain AB1380 using a lithium acetate procedure as described in
Gietz, R. and Woods, R., in Molecular Genetics of Yeast: Practical
Approaches, Oxford Press, pp. 121-134 (1994). Plasmid DNA was
isolated from putative transformants, and the presence of the
FAE1/pYEUra3 construct was confirmed by Southern analysis.

Yeast transformed with pYEUra3 having FAE1 operably linked to the
GAL1 promoter were grown in the presence of galactose or glucose and
were analyzed for the expression of FAE1. As a control, yeast
transformed with pYEUra3 containing no insert were also assayed.
Analysis of such control preparations yielded fatty acid
compositions and fatty acid elongation rates similar to those of
yeast transformed with pYEUra3/FAE1 and grown with glucose as the
carbon source.

The fatty acid composition of yeast cells grown in the presence of
galactose was compared to that of cells grown in the presence of
glucose, to determine whether VLCFAs were found in the
galactose-induced cells.

Transformed yeast cells were grown overnight in YPD media at 30°C
with vigorous shaking. One hundred µl of the overnight culture were
used to inoculate 40 ml of complete minimal uracil dropout media
(CM-Ura) supplemented with either glucose or galactose (2% w/v).
Cultures were grown at 30°C to an OD600 of approximately 1.3 to 1.5.
Cells were harvested by centrifugation at 5000 x g for 10 min. Total
lipids were extracted from the cells with 2 volumes of 4N KOH in
100% methanol for 60 min at 80°C. Fatty acids were saponified and
methyl esters were prepared by drying the samples and resuspending
in 0.5 ml of boron trichloride in methanol (10% v/v). Samples were
incubated at 50°C for 15 min in a sealed tube. About 2 ml of water
was then added and the fatty acid methyl esters were extracted
thrice with 1 ml of hexane. Extracts were dried under nitrogen and
redissolved in hexane. See Hlousek-Radojcic, A. et al., Plant J.
8:803-809 (1995). Methyl esters were analyzed on an HP 5890 series
II gas chromatograph equipped with a 5771MSD and 7673 auto injector
(Hewlett-Packard, Cincinnati, OH). Methyl esters were separated on a
DB-23 (J&W Scientific) capillary column (30 m x 0.25 mm x 0.25 µm).
The column was operated with helium carrier gas and splitless
injection (injection temperature 280°C, detector temperature 280°C).
After an initial 3 min at 100°C, the oven temperature was raised to
250°C at 20°C min⁻¹ and maintained at that temperature for an
additional 3 min. The identity of the peaks was verified by
cochromatography with authentic standards and by mass spectrometer
analysis.
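The oven program described above (an initial hold, a linear ramp, and a final hold) fixes the length of each chromatographic run. The sketch below works that arithmetic through; the function name and argument names are illustrative and do not come from the original text.

```python
def oven_program_minutes(start_c: float, end_c: float, ramp_c_per_min: float,
                         initial_hold_min: float, final_hold_min: float) -> float:
    """Total run time for a hold / linear ramp / hold GC oven program."""
    # Time spent ramping is the temperature span divided by the ramp rate.
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the text: 3 min at 100°C, 20°C/min to 250°C, 3 min hold.
total_run_min = oven_program_minutes(
    start_c=100.0, end_c=250.0, ramp_c_per_min=20.0,
    initial_hold_min=3.0, final_hold_min=3.0,
)
print(f"Total oven program time: {total_run_min} min")
```

Under these values the ramp takes 7.5 min, so each injection occupies the oven for 13.5 min, not counting cool-down and re-equilibration, which the text does not specify.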

The results clearly revealed the appearance of both 20:1 and 22:1
acyl-CoA products in galactose-induced yeast containing the FAE1
coding sequence. Uninduced yeast cells failed to accumulate
significant amounts of fatty acids longer than C18. These results
indicate that expression of FAE1 in yeast resulted in functional KAS
activity and functional elongase activity.

Example 2 
FAE1 Activity in Yeast Microsomes 

The functional expression of the FAE1 KAS was analyzed by isolating
microsomes from transformed yeast cells and assaying these
microsomes in vitro for elongase activity.

Transformed yeast cells were grown in the presence of either glucose
or galactose (2% w/v) as described in Example 1. Cells were
harvested by centrifugation at 5000 x g for 10 min and washed with
10 ml ice cold isolation buffer (IB), which contains 80 mM
Hepes-KOH, pH 7.2, 5 mM EGTA, 5 mM EDTA, 10 mM KCl, 320 mM sucrose
and 2 mM DTT. Cells were then resuspended in enough IB to fill a
1.7 ml tube containing 700 µl of 0.5 µ glass beads, and yeast
microsomes were isolated from the cells essentially as described in
Tillman, T. and Bell, R., J. Biol. Chem. 261:9144-9149 (1986). The
microsomal membrane pellet was recovered by centrifugation at
252,000 x g for 60 min. The pellet was rinsed by resuspending in
40 ml fresh IB and again recovered by centrifugation at 252,000 x g
for 60 min. Microsomal pellets were resuspended in a minimal volume
of IB, and the protein concentration adjusted to 2.5 µg µl⁻¹ by
addition of IB containing 15% glycerol. Microsomes were frozen on
dry ice and stored at -80°C. The protein concentration in microsomes
was determined by the Bradford method (Bradford, 1976).

Fatty acid elongase activity was measured essentially as described
in Hlousek-Radojcic, A. et al., Plant J. 8:803-809 (1995). Briefly,
the standard elongation reaction mix contained 80 mM Hepes-KOH, pH
7.2, 20 mM MgCl2, 500 µM NADPH, 1 mM ATP, 100 µM malonyl-CoA, 10 µM
CoA-SH and 15 µM radioactive acyl-CoA substrate. The radiolabeled
substrate was either [1-¹⁴C]18:1-CoA (50 µCi µmol⁻¹),
[1-¹⁴C]18:0-CoA (55 µCi µmol⁻¹), or [1-¹⁴C]16:0-CoA (54 µCi µmol⁻¹).
The reaction was initiated by the addition of yeast microsomes (5 µg
protein) and the mixture incubated at 30°C for the indicated period
of time. The final reaction volume was 25 µl.
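To make the scale of the standard elongation reaction concrete, the sketch below converts the stated concentrations into absolute amounts per 25 µl reaction and estimates the radioactivity contributed by the 18:1-CoA substrate. The dictionary and function names are illustrative assumptions; only the concentrations, the specific activity, the protein amount and the volume come from the text.

```python
REACTION_VOLUME_UL = 25.0  # final reaction volume from the text

# Small-molecule concentrations in the standard mix, in µM.
CONCENTRATIONS_UM = {
    "NADPH": 500.0,
    "ATP": 1000.0,        # 1 mM
    "malonyl-CoA": 100.0,
    "CoA-SH": 10.0,
    "acyl-CoA": 15.0,
}


def picomoles(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol; 1 µM is 1 pmol per µl."""
    return conc_um * volume_ul


amounts_pmol = {name: picomoles(c, REACTION_VOLUME_UL)
                for name, c in CONCENTRATIONS_UM.items()}

# Radioactivity of the 18:1-CoA substrate at 50 µCi/µmol (equal to 50 nCi/nmol).
acyl_coa_nmol = amounts_pmol["acyl-CoA"] / 1000.0
acyl_coa_nci = acyl_coa_nmol * 50.0

# Microsomal protein concentration in the assay: 5 µg in 25 µl.
protein_ug_per_ul = 5.0 / REACTION_VOLUME_UL

print(amounts_pmol)
print(f"{acyl_coa_nmol} nmol acyl-CoA, {acyl_coa_nci} nCi, {protein_ug_per_ul} µg/µl protein")
```

On these figures, each reaction holds 0.375 nmol of labeled acyl-CoA against 2.5 nmol of malonyl-CoA.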

Methyl esters of the acyl-CoA elongation products were prepared as
described in Example 1. Methyl esters were separated on reversed
phase silica gel KC18 TLC plates (Whatman, 250 µm thick), quantified
by phosphorimaging, and analyzed with ImageQuant software (Molecular
Dynamics, Inc., Sunnyvale, CA). The detection limit for each product
is about 0.001 nanomoles per min per mg microsomal protein,
depending on the phosphorimage exposure time.

Results of representative in vitro elongation assays are shown in
Figs. 1 and 2. The results indicate that microsomes from
galactose-induced cells expressing FAE1 catalyzed multiple cycles of
elongation starting with either C16:0 acyl-CoA, C18:0 acyl-CoA, or
C18:1 acyl-CoA as the substrate (Fig. 1). The 16:0 and 18:0 acyl-CoA
substrates were elongated to C26:0 acyl-CoA. In contrast, the
18:1-CoA substrate was elongated primarily to C20:1, with only low
levels of C22:1 acyl-CoA being produced. Occasionally, trace levels
of C24:1 CoA were also observed. Although the chain length of the
products from the 18:1 acyl-CoA substrate was shorter than the chain
length from the saturated acyl-CoA substrates, the rate of
elongation of oleoyl-CoA was about 2- and 3-fold higher than the
rates of elongation of 16:0-CoA and 18:0-CoA, respectively.

The elongation activity observed in microsomes from uninduced cells
indicated that there was a low level of endogenous elongase activity
when 18:1-CoA or 18:0-CoA were used as substrates. There was
substantial 16:0-CoA elongase activity (10.1 nmol mg protein⁻¹ at
30 min) in microsomes from uninduced cells (Fig. 2). However, the
major product of 16:0 elongation using uninduced microsomes was
C18:0 acyl-CoA, with only small amounts of products beyond this
length. The elongation of the 16:0 acyl-CoA substrate presumably is
due to an endogenous yeast elongase.

Elongation of 18:1 CoA by microsomes from induced cells occurred at
a rate about 18-fold higher than in microsomes isolated from the
uninduced cells (Fig. 2). With microsomes from induced yeast,
synthesis of 20:0 CoA from the 16:0 CoA substrate occurred at a rate
similar to that seen when the substrate was 18:0 CoA (4.2 vs. 5.1
nmol mg protein⁻¹). The total rate of elongation of [¹⁴C]16:0-CoA by
microsomes from induced cells (15.8 nmol mg protein⁻¹ at 30 min) was
more than 50% higher than elongation of [¹⁴C]16:0-CoA by microsomes
from uninduced cells, suggesting that the FAE1 KAS utilized 16:0-CoA
as a substrate in addition to C18-C24 acyl-CoAs. The FAE1 elongase
KAS activity, i.e., the difference in the 16:0 elongation rates
between microsomes from induced and uninduced cells, was 5.7 nmol mg
protein⁻¹. The elongation rate with the 16:0 substrate thus was
similar to the elongase activity of the FAE1 elongase KAS with the
18:0 substrate.
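The FAE1-dependent share of 16:0-CoA elongation is obtained by subtracting the endogenous (uninduced) activity from the induced activity. The short sketch below reproduces that arithmetic from the two rates quoted in the text; the variable names are illustrative.

```python
# 16:0-CoA elongation at 30 min, in nmol per mg protein (values from the text).
induced_rate = 15.8     # microsomes from galactose-induced cells
uninduced_rate = 10.1   # microsomes from uninduced cells (endogenous elongase)

# Activity attributable to FAE1 is the induced rate minus the endogenous background.
fae1_dependent_rate = induced_rate - uninduced_rate

# Relative increase of the induced rate over the uninduced rate, in percent.
percent_increase = 100.0 * fae1_dependent_rate / uninduced_rate

print(f"FAE1-dependent activity: {fae1_dependent_rate:.1f} nmol/mg protein")
print(f"Induced rate is {percent_increase:.0f}% higher than uninduced")
```

The difference is 5.7 nmol mg protein⁻¹, an increase of roughly 56%, which matches the statement that the induced rate was more than 50% higher.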

These results indicate that FAE1 KAS expressed in yeast could
synthesize 3-ketoacyl-CoA in vitro and, in combination with yeast
reductases and dehydrases, could form a functional VLCFA elongase
complex. In addition, these results suggest that FAE1 is
membrane-bound in yeast cells.



Example 3 

Cloning and Sequencing of Arabidopsis Elongase Genes

The sequence of a jojoba seed cDNA (see WO 93/10241 and WO 95/15387,
incorporated herein by reference) was used to search the Arabidopsis
expressed sequence tag (EST) database of the Arabidopsis Genome
Stock Center (The Ohio State University, Columbus, Ohio). The BLAST
computer program (National Institutes of Health, Bethesda, MD, USA)
was used to perform the search. The search identified two ESTs
(ATTS1282 and ATTS3218) that had a high degree of sequence identity
with the jojoba sequence. The ATTS1282 and ATTS3218 ESTs appeared to
be partial cDNA clones rather than full-length clones based on the
length of the jojoba sequence.

A genomic DNA library from Arabidopsis thaliana cv. Columbia was
prepared in the lambda GEM11 vector (Promega, Madison, Wisconsin)
and was obtained from Ron Davis, Stanford University, Stanford, CA.
The library was hybridized with ATTS1282 and ATTS3218 as probes and
2 clones were identified for each EST. Phage DNA was isolated from
each of the hybridizing clones, and the genomic insert was excised
with the restriction enzyme SacI and subcloned into the plasmid
pBluescript (Stratagene, La Jolla, CA). One clone from the ATTS1282
hybridization was designated EL1 and one clone from the ATTS3218
hybridization was designated EL2.

A yeast expression library, containing cDNA from Arabidopsis
thaliana cv. Columbia, was prepared in the lambda YES expression
vector described in Elledge et al. (Elledge, S. et al., Proc. Natl.
Acad. Sci. USA 88:1731-1735 (1991)) and was obtained from Ron Davis
at Stanford University, Stanford, CA. The library was hybridized
with an EL2 partial cDNA probe. A full-length EL2 cDNA was not
identified. However, the probe did identify a full-length cDNA which
was designated EL3.

A consensus sequence for the C-terminal region of EL1, EL2 and the
jojoba cDNA polypeptides was identified by sequence alignment using
DNA analysis programs from DNAStar, Madison, Wisconsin. This
consensus sequence was used to search the Arabidopsis EST database
again for β-keto acyl synthase sequences. These searches identified
four additional putative β-keto acyl synthase ESTs, which were
designated EL4 through EL7. EL4, EL5, EL6, and EL7 have homology to
Genbank Accession Nos. T04345, T44939, T22193 and T76700,
respectively.

The lambda YES cDNA expression library described above was
hybridized with the EL1 and EL4-EL7 ESTs as probes. This screen
identified full-length cDNAs for EL1, EL5 and EL6.

The lambda GEM11 genomic library was hybridized with the EL4 and EL7
ESTs as probes. This screen identified full-length genomic clones
for EL4 and EL7. Phage DNA was isolated from each of the hybridizing
clones and subcloned into pBluescript as described above.

The 7 EL clones were sequenced on both strands with regions of
overlap for each sequence run. Sequencing was carried out with an
ABI automated sequencer (Applied Biosystems, Inc., Foster City,
California), following the manufacturer's instructions.

The nucleotide sequences for the coding regions of EL1-EL7 are shown
in Figs. 3, 5, 7, 9, 11, 13 and 15, respectively. The deduced amino
acid sequences for EL1-EL7 are shown in Figs. 4, 6, 8, 10, 12, 14
and 16, respectively, using the standard one-letter amino acid code.
The EL1, EL2 and EL7 genomic clones appeared to lack introns. The
EL4 genomic clone contained one intron near the 5' end of the coding
region.

The nucleotide sequences of the 7 EL polynucleotides were compared
to 5 DNA sequences present in Genbank (National Center for
Biotechnology Information, Bethesda, MD). Two of the 5 accessions
were cloned from members of the Brassicaceae: the Arabidopsis FAE1
sequence (Accession U29142) and a Brassica napus sequence (Accession
U50771). Three of the accessions were cloned from jojoba (Simmondsia
chinensis): 2 wax biosynthesis genes (Accessions I14084 and I14085)
and a jojoba KAS gene (Accession U37088). See also U.S. Patent
5,445,947, incorporated herein by reference.

Multiple alignment of the 12 sequences was carried out with a
computer program sold under the trade name MEGALIGN Lasergene by
DNAStar (Madison, Wisconsin). Alignments were done using the Clustal
method with weighted residue weight table. The nucleotide sequence
similarity index and percent divergence based on the multiple
alignment algorithm are shown in Table 1. The nucleotide sequences
of EL1-EL7 are distinguishable from the 5 DNA sequences obtained
from Genbank.
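The similarity index and percent divergence reported for the alignments are defined by the MEGALIGN program and are not reproduced here. As a simpler stand-in that conveys the idea of a pairwise comparison, the sketch below computes percent identity over the gap-free columns of two sequences that have already been aligned. The function name and the two short sequences are hypothetical and are not taken from any EL clone or Genbank accession.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical, pre-aligned toy sequences (one gap, one mismatch).
toy_a = "ATGACGTCC-GTTAAC"
toy_b = "ATGACTTCCAGTTAAC"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.1f}% identity over gap-free columns")
```

The same function works on aligned amino acid sequences, although it treats all mismatches equally and so does not reflect a residue weight table such as PAM250.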

The deduced amino acid sequences of the EL1-EL7 polypeptides were
compared with the MEGALIGN program to the deduced amino acid
sequences of the same 5 Genbank clones, using the Clustal method
with PAM250 residue weight table. The amino acid sequence similarity
and percent divergence are shown in Table 2. The amino acid
sequences of EL1-EL7 polypeptides are distinguishable from those of
the Genbank sequences.

Example 4
Expression of EL1 and EL2 in Yeast

The open reading frames (ORFs) for the EL2, EL4 and EL7 clones were
amplified by PCR. The EL2 ORF was cloned into λYES using the
primers:
CTCGAGCAAGTCCACTACCACGCA and CTCGAGCGAGTCAGAAGGAACAAA.
The EL4 ORF was cloned into pYEUra3 using the primers:
GATAATTTAGAGAGGCACAGGGT and GTCGACACAAGAATGGGTAGATCCAA.
The EL7 ORF was cloned into pYEUra3 using the primers:
CAGTTCCTCAAACGAAGCTA and GTCGACTTCTCAATGGACGGTGCCGGA.
Amplified products were cloned into pYEUra3 under the control of,
and 3' to, the GAL1 promoter. The resulting plasmids were
transformed into yeast as described in Example 1.
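Several of the primers listed above begin with a six-base sequence that matches a well-known restriction enzyme recognition site. The following sketch scans each primer for two such sites. The enzyme assignments (XhoI for CTCGAG, SalI for GTCGAC) are standard recognition sequences, but the original text does not name these enzymes, so reading them as cloning sites is an assumption. The dictionary and function names are illustrative.

```python
# Standard recognition sequences; both are palindromic, so scanning one strand suffices.
SITES = {"XhoI": "CTCGAG", "SalI": "GTCGAC"}

# Primer pairs copied from Example 4, written 5' to 3'.
PRIMERS = {
    "EL2": ("CTCGAGCAAGTCCACTACCACGCA", "CTCGAGCGAGTCAGAAGGAACAAA"),
    "EL4": ("GATAATTTAGAGAGGCACAGGGT", "GTCGACACAAGAATGGGTAGATCCAA"),
    "EL7": ("CAGTTCCTCAAACGAAGCTA", "GTCGACTTCTCAATGGACGGTGCCGGA"),
}


def sites_in(primer: str) -> list[str]:
    """Names of the recognition sites that occur anywhere in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


# For each gene, one list of site names per primer, in the order given in the text.
site_report = {gene: [sites_in(p) for p in pair] for gene, pair in PRIMERS.items()}

for gene, report in site_report.items():
    print(gene, report)
```

On these sequences, both EL2 primers carry CTCGAG, while only the second primer of the EL4 and EL7 pairs carries GTCGAC.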

Yeast cultures containing full-length EL1 in λYES and full-length
EL2 in pYEUra3 were grown in the presence of galactose or glucose as
described in Example 2. Microsomes were then prepared from each of
the cultures and fatty acid elongation assays were carried out as
described in Example 2.

In the first experiment, microsomes were prepared from
galactose-induced cultures of EL1, EL2 and FAE1, and incubated with
either [1-¹⁴C]18:0 acyl-CoA or [1-¹⁴C]18:1 acyl-CoA as substrate.
The amounts of various reaction products synthesized after 30
minutes (min) were determined as described in Example 2. The results
when 18:0 acyl-CoA was the substrate are shown in Table 3. The
results when 18:1 acyl-CoA was the substrate are shown in Table 4.

Table 3.

Elongation of 18:0-CoA by FAE1, EL1 and EL2 Genes
Expressed in Yeast

                       β-Keto Acyl Synthase Gene
Acyl-CoA         FAE1              EL1               EL2
Product      Rate¹    (%)      Rate     (%)      Rate     (%)
20:0         0.369    64.3     0.084    38.8     0.108    41.8
22:0         0.113    18.6     0.047    21.9     0.053    20.7
24:0         0.065    10.7     0.043    19.9     0.052    20.3
26:0         0.038     6.3     0.042    19.4     0.044    17.2
Total        0.585   100.0     0.216   100.0     0.258   100.0

¹ Nanomoles/minute/mg of microsomal protein



Table 4.

Elongation of 18:1-CoA by FAE1, EL1 and EL2 Genes
Expressed in Yeast

                       β-Keto Acyl Synthase Gene
Acyl-CoA         FAE1              EL1               EL2
Product      Rate¹    (%)      Rate     (%)      Rate     (%)
20:1         1.131    84.6     0.111    80.8     0.091    84.1
22:1         0.206    15.4     0.026    19.2     0.017    15.9
24:1         0.0       0.0     0.0       0.0     0.0       0.0
26:1         0.0       0.0     0.0       0.0     0.0       0.0
Total        1.337   100.0     0.137   100.0     0.108   100.0

¹ Nanomoles/minute/mg of microsomal protein



The results shown in Tables 3 and 4 indicate that the EL1 and EL2
gene products have β-ketoacyl synthase (KAS) activity and that the
KAS reaction product can be utilized to form VLCFAs. The specific
activities of the 3 KAS enzymes cannot be compared, since the
relative amount of the heterologous KAS protein in each microsomal
preparation is not known. However, the proportions of various
reaction products can be compared between FAE1, EL1 and EL2.

The data shown in Table 3 indicate that the EL1 and EL2 KAS
activities result in a higher proportion of saturated VLCFAs than
does the FAE1 KAS activity. These results suggest that EL1 and EL2
encode novel gene products, because EL1 and EL2 have a greater
preference for C22:0 and C24:0 acyl-CoA substrates than does FAE1.
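Because absolute rates cannot be compared across the three enzymes, the comparison rests on product proportions. The sketch below recomputes, from the Table 3 rates, the share of elongation products that are C24 or longer for each gene. The data structure and function name are illustrative, and the shares are computed from the individual rates, so they can differ slightly from the rounded percentages printed in the table.

```python
# Product formation rates from Table 3 (nmol/min/mg microsomal protein, 18:0-CoA substrate).
TABLE_3_RATES = {
    "FAE1": {"20:0": 0.369, "22:0": 0.113, "24:0": 0.065, "26:0": 0.038},
    "EL1": {"20:0": 0.084, "22:0": 0.047, "24:0": 0.043, "26:0": 0.042},
    "EL2": {"20:0": 0.108, "22:0": 0.053, "24:0": 0.052, "26:0": 0.044},
}


def long_chain_share(rates: dict[str, float], min_carbons: int = 24) -> float:
    """Percent of total product rate found in products with at least min_carbons carbons."""
    total = sum(rates.values())
    long_total = sum(rate for product, rate in rates.items()
                     if int(product.split(":")[0]) >= min_carbons)
    return 100.0 * long_total / total


long_share = {gene: long_chain_share(rates) for gene, rates in TABLE_3_RATES.items()}

for gene, share in long_share.items():
    print(f"{gene}: {share:.1f}% of products are C24 or longer")
```

With these inputs, products of C24 or longer make up roughly 18% of the FAE1 total but close to 40% of the EL1 and EL2 totals.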

A comparison of the relative elongation activity of FAE1 with 18:0
and 18:1 substrates (Tables 3 and 4) indicates that FAE1 is more
active when 18:1 is the substrate than when 18:0 is the substrate.
In contrast, the overall rate of product formation with EL1 is less
when 18:1 is the substrate than when 18:0 is the substrate (Tables 3
and 4). EL2 is also less active when 18:1 is the substrate than when
18:0 is the substrate (Tables 3 and 4). These results support the
conclusion that EL1 and EL2 encode novel gene products and suggest
that EL1 and EL2 have a preference for saturated fatty acids as
substrates, whereas the FAE1 gene product has a preference for
monounsaturated fatty acids as substrates.

In a second experiment, microsomes were prepared from
galactose-induced and from glucose-repressed yeast cultures
containing EL1 or EL2 coding sequences. The microsomal preparations
were incubated with either 18:0 acyl-CoA or 18:1 acyl-CoA as
substrate and the fatty acid reaction products determined as
described above. The results with the 18:0 substrate are shown in
Table 5. The results with the 18:1 substrate are shown in Table 6.

Table 5.

Elongation of 18:0-CoA by EL1 and EL2
With and Without Induction of Gene Expression

                          β-Keto Acyl Synthase Gene
                        EL1                            EL2
Acyl-       +Glucose      +Galactose       +Glucose      +Galactose
CoA        Rate¹    (%)    Rate    (%)     Rate    (%)    Rate    (%)
20:0       0.007  100.0    0.074   55.8    0.030   81.3   0.107   43.1
22:0       0.000    0.0    0.023   17.4    0.002    5.1   0.044   17.8
24:0       0.000    0.0    0.020   15.3    0.005   13.6   0.048   19.1
26:0       0.000    0.0    0.015   11.5    0.000    0.0   0.050   20.0
Total      0.007  100.0    0.133  100.0    0.037  100.0   0.249  100.0

¹ Nanomoles/minute/mg of microsomal protein



Table 6.

Elongation of 18:1-CoA by EL1 and EL2
With and Without Induction of Gene Expression

                          β-Keto Acyl Synthase Gene
                        EL1                            EL2
Acyl-       +Glucose      +Galactose       +Glucose      +Galactose
CoA        Rate¹    (%)    Rate    (%)     Rate    (%)    Rate    (%)
20:1       0.062  100.0    0.081  100.0    0.043  100.0   0.089  100.0
22:1       0.000    0.0    0.000    0.0    0.000    0.0   0.000    0.0
24:1       0.000    0.0    0.000    0.0    0.000    0.0   0.000    0.0
26:1       0.000    0.0    0.000    0.0    0.000    0.0   0.000    0.0
Total      0.062  100.0    0.081  100.0    0.043  100.0   0.089  100.0

¹ Nanomoles/minute/mg of microsomal protein



The results in Table 5 show in vitro elongase activity for EL1 and
EL2 under induced (galactose) and uninduced (glucose) conditions.
The comparison indicates that induction with galactose results in a
large increase in overall elongase activity when 18:0 acyl-CoA is
the substrate (about 19-fold and 7-fold for EL1 and EL2,
respectively). In contrast, induction when 18:1 acyl-CoA is the
substrate results in only a small increase in elongase activity
(about 1.3-fold and 2-fold for EL1 and EL2, respectively), as shown
in Table 6.
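The fold-induction figures quoted above follow directly from the "Total" rows of Tables 5 and 6. The sketch below recomputes them as the ratio of the galactose rate to the glucose rate; the dictionary layout and names are illustrative.

```python
# Total elongation rates (nmol/min/mg microsomal protein) from Tables 5 and 6,
# keyed by (gene, substrate) and stored as (glucose, galactose).
TOTAL_RATES = {
    ("EL1", "18:0"): (0.007, 0.133),
    ("EL2", "18:0"): (0.037, 0.249),
    ("EL1", "18:1"): (0.062, 0.081),
    ("EL2", "18:1"): (0.043, 0.089),
}

# Fold induction is the induced (galactose) rate divided by the uninduced (glucose) rate.
fold_induction = {key: galactose / glucose
                  for key, (glucose, galactose) in TOTAL_RATES.items()}

for (gene, substrate), fold in fold_induction.items():
    print(f"{gene} with {substrate}-CoA: {fold:.1f}-fold")
```

The ratios come out at about 19 and 6.7 for the 18:0 substrate and about 1.3 and 2.1 for the 18:1 substrate, consistent with the rounded values in the text.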

The results in Table 5 show that little or no VLCFA products are
made by yeast microsomes under uninduced conditions. Upon induction
of EL1 and EL2 gene expression, however, significant quantities of
C20:0, C22:0, C24:0 and C26:0 are made. The data in Tables 5 and 6
are consistent with the results in Tables 3 and 4, which indicated
that EL1 and EL2 were more active with a saturated fatty acid
substrate than with a monounsaturated substrate.

The data in Tables 5 and 6 are also consistent with the data in
Tables 3 and 4 indicating that the EL1 and EL2 gene products are
more active in converting C24:0 to C26:0 than is FAE1.

In a third experiment, microsomes from induced and uninduced
cultures containing EL1 or EL2 were incubated in the absence of
cofactors involved in the β-ketoacyl condensation reaction. Cultures
were induced and microsomes were prepared as described in Example 2.
In vitro assays were carried out as described in Example 2, except
that either ATP, CoASH or both were omitted from the enzyme reaction
mixture. In addition, one reaction was carried out in a complete
mixture having 0.01 mM of cerulenin (Sigma, St. Louis, MO).
Cerulenin is an inhibitor of some condensing enzymes. The results
are shown in Tables 7-9.

Table 7.

Effect of Cofactors on 18:0-CoA Elongation¹

Gene   Expt⁴   +Glu²   +Gal²   -ATP³   -CoA³   -A&C³   +Cer³
EL1    1       .037    .109    .095    .105    .119    .141
       2       N.D.    .090    .125    .093    .270    .176
EL2    1       .033    .112    .168    .127    .143    .238
       2       N.D.    .120    .178    .133    .195    .302

¹ Activity in nanomoles/minute/mg of microsomal protein.

² +Glu: microsomes from cultures grown in the presence of glucose
and incubated in standard reaction mix; +Gal: microsomes from
cultures grown in the presence of galactose and incubated in
standard reaction mix.

³ Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP
and Coenzyme A omitted from reaction mix; +Cer: Standard reaction
mix containing 0.01 mM cerulenin.

⁴ Experiment No.

Table 8.

Effect of Cofactors on Elongation Products of EL1¹

Prod.    +Glu²   +Gal²   -ATP³   -CoA³   -A&C³   +Cer³
20:0      53.9    46.2    34.4    47.8    41.7    46.7
22:0      14.4    18.7    13.7    18.0    19.4    16.2
24:0      18.5    18.1    20.6    19.1    16.7    17.7
26:0      13.2    17.1    31.4    15.2    22.3    19.4
Total    100.0   100.0   100.0   100.0   100.0   100.0

¹ Amount of indicated product as a percent of total products formed.
Results of one experiment for +Glucose; average of two experiments
for other conditions.

² +Glu: microsomes from cultures grown in the presence of glucose
and incubated in standard reaction mix; +Gal: microsomes from
cultures grown in the presence of galactose and incubated in
standard reaction mix.

³ Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP
and Coenzyme A omitted from reaction mix; +Cer: Standard reaction
mix containing 0.01 mM cerulenin.

Table 9.

Effect of Cofactors on Elongation Products of EL2 1 



Prod. 


-fGlu 2 


+Gal 2 


-ATP 3 


-CoA 3 


-A&C 3 


+Cer 3 


20:0 


54 .5 


47 . 1 


34 . 1 


45.3 


38 . 0 


41 . 8 


22 :0 


17. 1 


19.1 


16 .4 


19.2 


15 . 9 


16 . 1 


24 :0 


5 . 8 


19.4 


20 . 8 


19. 9 


18 .4 


20 . 4 


26: 0 


22.6 


14 .5 


28 . 9 


15 . 8 


27 * 8 


21 . 8 


Total 


100 . 0 


100 . 0 


100 . 0 


100 . 0 


100 . 0 


100 . 0 



1 Amount of indicated product as a percent of total products formed.
Results of one experiment for +Glucose; Average of two experiments for
other conditions.

2 +Glu: microsomes from cultures grown in the presence of glucose and
incubated in standard reaction mix; +Gal: microsomes from cultures grown
in the presence of galactose and incubated in standard reaction mix.

3 Microsomes from galactose-induced cultures. -ATP: ATP omitted from
reaction mix; -CoA: Coenzyme A omitted from reaction mix; -A&C: ATP and
Coenzyme A omitted from reaction mix; +Cer: Standard reaction mix
containing 0.01 mM cerulenin.

The results in Table 7 indicate that omission of
ATP and/or CoA from the incubation mixture does not have
a significant effect on the overall amounts of VLCFAs
synthesized by the in vitro KAS activity of EL1 or EL2.
The results also show that cerulenin does not inhibit the
KAS activity of EL1 or EL2. The data in Tables 8 and 9
confirm that EL1 and EL2 KAS activity produces
significant amounts of C24:0 and C26:0 acyl CoA products.
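
As a rough illustration of how these statements follow from the tabulated values, the Python sketch below recomputes two quantities: the activity of each treatment relative to the +Gal standard mix (Table 7, Experiment 1) and the combined share of 24:0 and 26:0 products (Tables 8 and 9). The variable and function names are hypothetical and are not part of the disclosure; the sketch also assumes that the six Table 7 columns follow the order +Glu, +Gal, -ATP, -CoA, -A&C, +Cer, as the footnotes suggest.

```python
# Illustrative recalculation from Tables 7-9. All names are hypothetical;
# only the numeric values are taken from the tables.

CONDITIONS = ["+Glu", "+Gal", "-ATP", "-CoA", "-A&C", "+Cer"]

# Table 7, Experiment 1: activity in nmol/min/mg of microsomal protein.
# Assumes the six table columns follow the order of CONDITIONS.
TABLE7_EXP1 = {
    "EL1": [0.037, 0.109, 0.095, 0.105, 0.119, 0.141],
    "EL2": [0.033, 0.112, 0.168, 0.127, 0.143, 0.238],
}

# Tables 8 and 9: each product as a percent of total products formed.
PRODUCTS = {
    "EL1": {
        "20:0": [53.9, 46.2, 34.4, 47.8, 41.7, 46.7],
        "22:0": [14.4, 18.7, 13.7, 18.0, 19.4, 16.2],
        "24:0": [18.5, 18.1, 20.6, 19.1, 16.7, 17.7],
        "26:0": [13.2, 17.1, 31.4, 15.2, 22.3, 19.4],
    },
    "EL2": {
        "20:0": [54.5, 47.1, 34.1, 45.3, 38.0, 41.8],
        "22:0": [17.1, 19.1, 16.4, 19.2, 15.9, 16.1],
        "24:0": [5.8, 19.4, 20.8, 19.9, 18.4, 20.4],
        "26:0": [22.6, 14.5, 28.9, 15.8, 27.8, 21.8],
    },
}


def relative_to_gal(activities):
    """Express each activity as a fraction of the +Gal (standard mix) value."""
    gal = activities[CONDITIONS.index("+Gal")]
    return {cond: value / gal for cond, value in zip(CONDITIONS, activities)}


def c24_c26_share(products):
    """Sum the 24:0 and 26:0 percentages for each condition."""
    return {
        cond: products["24:0"][i] + products["26:0"][i]
        for i, cond in enumerate(CONDITIONS)
    }


if __name__ == "__main__":
    for enzyme in ("EL1", "EL2"):
        ratios = relative_to_gal(TABLE7_EXP1[enzyme])
        shares = c24_c26_share(PRODUCTS[enzyme])
        for cond in CONDITIONS[2:]:
            print(f"{enzyme} {cond}: {ratios[cond]:.2f} x +Gal activity, "
                  f"{shares[cond]:.1f}% as 24:0 + 26:0")
```

For EL1 in Experiment 1, the -ATP, -CoA and -A&C ratios come out at roughly 0.87, 0.96 and 1.09 of the +Gal activity. The sketch only restates the tabulated numbers in another form and is not a statistical test of significance.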

To the extent not already indicated, it will be 
understood by those of ordinary skill in the art that any 
one of the various specific embodiments herein described 
and illustrated may be further modified to incorporate 
features shown in other of the specific embodiments. 

The foregoing detailed description has been
provided for a better understanding of the invention only
and no unnecessary limitation should be understood
therefrom as some modifications will be apparent to those
skilled in the art without deviating from the spirit and
scope of the appended claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION
(i) APPLICANT: CARGILL, INCORPORATED
(ii) TITLE OF THE INVENTION: FATTY ACID ELONGASES



(iii) NUMBER OF SEQUENCES: 14 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fish & Richardson P.C., P.A.

(B) STREET: 60 South Sixth Street, Suite 3300 

(C) CITY: Minneapolis 

(D) STATE: MN 

(E) COUNTRY: USA 

(F) ZIP: 55402 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 08/868,373 

(B) FILING DATE: 03-JUN-1997 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Lundquist, Ronald C 

(B) REGISTRATION NUMBER: 37,875 

(C) REFERENCE/DOCKET NUMBER: 07039/064WO1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 612-335-5050 

(B) TELEFAX: 612-288-9696 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1560 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGGATCGAG AGAGATTAAC GGCGGAGATG GCGTTTCGAG ATTCATCATC GGCCGTTATA 60
AGAATTCGAA GACGTTTGCC GGATTTATTA ACGTCCGTTA AGCTCAAATA CGTGAAGCTT 120
GGACTTCACA ACTCTTGCAA CGTGACCACC ATTCTCTTCT TCTTAATTAT TCTTCCTTTA 180
ACCGGAACCG TGCTGGTTCA GCTAACCGGT CTAACGTTCG ATACGTTCTC TGAGCTTTGG 240
TCTAACCAGG CGGTTCAACT CGACACGGCG ACGAGACTTA CCTGCTTGGT TTTCCTCTCC 300
TTCGTTTTGA CCCTCTACGT GGCTAACCGG TCTAAACCGG TTTACCTAGT GGATTTCTCC 360
TGCTACAAAC CGGAAGACGA GCGTAAAATA TCAGTAGATT CGTTCTTGAC GATGACTGAG 420
GAAAATGGAT CATTCACCGA TGACACGGTT CAGTTCCAGC AAAGAATCTC GAACCGGGCC 480
GGTTTGGGAG ACGAGACGTA TCTGCCACGT GGCATAACTT CAACGCCCCC
ATGTCAGAGG CACGTGCCGA AGCTGAAGCC GTTATGTTTG GAGCCTTAGA
GAGAAAACCG GAATTAAACC GGCCGAAGTC GGAATCTTGA TAGTAAACTG
AATCCGACGC CGTCTCTATC AGCGATGATC GTGAACCATT ACAAGATGAG
AAAAGTTACA ACCTCGGAGG AATGGGTTGC TCCGCCGGAT TAATCTCAAT
AACAATCTCC TCAAAGCAAA CCCTAATTCT TACGCTGTCG TGGTAAGCAC
ACCCTAAACT GGTACTTCGG AAATGACCGG TCAATGCTCC TCTGCAACTG
ATGGGCGGAG CTGCGATTCT CCTCTCTAAC CGCCGTCAAG ACCGGAAGAA
TCGCTGGTCA ACGTCGTTCG AACACATAAA GGATCAGACG ACAAGAACTA
TACCAGAAGG AAGACGAGAG AGGAACAATC GGTGTCTCTT TAGCTAGAGA
GTCGCCGGAG ACGCTCTGAA AACAAACATC ACGACTTTAG GACCGATGGT
TCAGAGCAGT TGATGTTCTT GATTTCCTTG GTCAAAAGGA AGATGTTCAA
AAACCGTATA TTCCGGATTT CAAGCTAGCT TTCGAGCATT TCTGTATTCA
AGAGCGGTTC TAGACGAAGT GCAGAAGAAT CTTGATCTCA AAGATTGGCA
TCTAGAATGA CTTTGCACAG ATTTGGTAAC ACTTCGAGTA GCTCGCTTTG
GCTTATACCG AAGCTAAGGG TCGGGTTAAA GCTGGTGACC GACTTTGGCA
GGATCGGGTT TCAAGTGTAA TAGTGCGGTT TGGAAAGCGT TACGACCGGT
GAGATGACCG GTAATGCTTG GGCTGGTTCG ATTGATCAAT ATCCGGTTAA

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 520 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asp Arg Glu Arg Leu Thr Ala Glu Met Ala Phe Arg Asp Ser Ser
  1               5                  10                  15
Ser Ala Val Ile Arg Ile Arg Arg Arg Leu Pro Asp Leu Leu Thr Ser
             20                  25                  30
Val Lys Leu Lys Tyr Val Lys Leu Gly Leu His Asn Ser Cys Asn Val
         35                  40                  45
Thr Thr Ile Leu Phe Phe Leu Ile Ile Leu Pro Leu Thr Gly Thr Val
     50                  55                  60
Leu Val Gln Leu Thr Gly Leu Thr Phe Asp Thr Phe Ser Glu Leu Trp
 65                  70                  75                  80
Ser Asn Gln Ala Val Gln Leu Asp Thr Ala Thr Arg Leu Thr Cys Leu
                 85                  90                  95
Val Phe Leu Ser Phe Val Leu Thr Leu Tyr Val Ala Asn Arg Ser Lys
            100                 105                 110
Pro Val Tyr Leu Val Asp Phe Ser Cys Tyr Lys Pro Glu Asp Glu Arg
        115                 120                 125
Lys Ile Ser Val Asp Ser Phe Leu Thr Met Thr Glu Glu Asn Gly Ser
    130                 135                 140
Phe Thr Asp Asp Thr Val Gln Phe Gln Gln Arg Ile Ser Asn Arg Ala
145                 150                 155                 160
Gly Leu Gly Asp Glu Thr Tyr Leu Pro Arg Gly Ile Thr Ser Thr Pro
                165                 170                 175
Pro Lys Leu Asn Met Ser Glu Ala Arg Ala Glu Ala Glu Ala Val Met
            180                 185                 190
Phe Gly Ala Leu Asp Ser Leu Phe Glu Lys Thr Gly Ile Lys Pro Ala
        195                 200                 205
Glu Val Gly Ile Leu Ile Val Asn Cys Ser Leu Phe Asn Pro Thr Pro
    210                 215                 220
Ser Leu Ser Ala Met Ile Val Asn His Tyr Lys Met Arg Glu Asp Ile
225                 230                 235                 240
Lys Ser Tyr Asn Leu Gly Gly Met Gly Cys Ser Ala Gly Leu Ile Ser
                245                 250                 255
Ile Asp Leu Ala Asn Asn Leu Leu Lys Ala Asn Pro Asn Ser Tyr Ala
            260                 265                 270
Val Val Val Ser Thr Glu Asn Ile Thr Leu Asn Trp Tyr Phe Gly Asn
        275                 280                 285

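[SEQ ID NO:1, final ten-base group and end position of each sequence line; the groups ending at positions 540 through 1020 are illegible]
GCTCATGTCT 1080
TCTTCCATTG 1140
GTTAAAAGTT 1200
CGCAGGAGGT 1260
CATGGAACCT 1320
GTATGAGATG 1380
GATTGCGTTT 1440
TTCGACGGAG 1500
AGTTGTGCAA 1560
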
Asp Arg Ser Met Leu Leu Cys Asn Cys Ile Phe Arg Met Gly Gly Ala
    290                 295                 300
Ala Ile Leu Leu Ser Asn Arg Arg Gln Asp Arg Lys Lys Ser Lys
305                 310                 315                 320
Ser Leu Val Asn Val Val Arg Thr His Lys Gly Ser Asp Asp Lys Asn
                325                 330                 335
Tyr Asn Cys Val Tyr Gln Lys Glu Asp Glu Arg Gly Thr Ile Gly Val
            340                 345                 350
Ser Leu Ala Arg Glu Leu Met Ser Val Ala Gly Asp Ala Leu Lys Thr
        355                 360                 365
Asn Ile Thr Thr Leu Gly Pro Met Val Leu Pro Leu Ser Glu Gln Leu
    370                 375                 380
Met Phe Leu Ile Ser Leu Val Lys Arg Lys Met Phe Lys Leu Lys Val
385                 390                 395                 400
Lys Pro Tyr Ile Pro Asp Phe Lys Leu Ala Phe Glu His Phe Cys Ile
                405                 410                 415
His Ala Gly Gly Arg Ala Val Leu Asp Glu Val Gln Lys Asn Leu Asp
            420                 425                 430
Leu Lys Asp Trp His Met Glu Pro Ser Arg Met Thr Leu His Arg Phe
        435                 440                 445
Gly Asn Thr Ser Ser Ser Ser Leu Trp Tyr Glu Met Ala Tyr Thr Glu
    450                 455                 460
Ala Lys Gly Arg Val Lys Ala Gly Asp Arg Leu Trp Gln Ile Ala Phe
465                 470                 475                 480
Gly Ser Gly Phe Lys Cys Asn Ser Ala Val Trp Lys Ala Leu Arg Pro
                485                 490                 495
Val Ser Thr Glu Glu Met Thr Gly Asn Ala Trp Ala Gly Ser Ile Asp
            500                 505                 510
Gln Tyr Pro Val Lys Val Val Gln
        515                 520



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1479 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGGATTACC CCATGAAGAA GGTAAAAATC TTTTTCAACT ACCTCATGGC GCATCGCTTC 60
AAGCTCTGCT TCTTACCATT AATGGTTGCT ATAGCCGTGG AGGCGTCTCG TCTTTCCACA 120
CAAGATCTCC AAAACTTTTA CCTCTACTTA CAAAACAACC ACACATCTCT AACCATGTTC 180
TTCCTTTACC TCGCTCTCGG GTCGACTCTT TACCTCATGA CCCGGCCCAA ACCCGTTTAT 240
CTCGTTGACT TTAGCTGCTA CCTCCCACCG TCGCATCTCA AAGCCAGCAC CCAGAGGATC 300
ATGCAACACG TAAGGCTTGT ACGAGAAGCA GGCGCGTGGA AGCAAGAGTC CGATTACTTG 360
ATGGACTTCT GCGAGAAGAT TCTAGAACGT TCCGGTCTAG GCCAAGAGAC GTACGTACCC 420
GAAGGTCTTC AAACTTTGCC ACTACAACAG AATTTGGCTG TATCACGTAT AGAGACGGAG 480
GAAGTTATTA TTGGTGCGGT CGATAATCTG TTTCGCAACA CGGGAATAAG CCCTAGTGAT 540
ATAGGTATAT TGGTGGTGAA TTCAAGCACT TTTAATCCAA CACCTTCGCT ATCAAGTATC 600
TTAGTGAATA AGTTTAAACT TAGGGATAAT ATAAAGAGCT TGAATCTTGG TGGGATGGGG 660
TGTAGCGCTG GAGTCATCGC TATCGATGCG GCTAAGAGCT TGTTACAAGT TCATAGAAAC 720
ACTTATGCTC TTGTGGTGAG CACGGAGAAC ATCACTCAAA ACTTGTACAT GGGTAACAAC 780
AAATCAATGT TGGTTACAAA CTGTTTGTTC CGTATAGGTG GGGCCGCGAT TTTGCTTTCT 840
AACCGGTCTA TAGATCGTAA ACGCGCAAAA TACGAGCTTG TTCACACCGT GCGGGTCCAT 900
ACCGGAGCAG ATGACCGATC CTATGAATGT GCAACTCAAG AAGAGGATGA AGATGGCATA 960
GTTGGGGTTT CCTTGTCAAA GAATCTACCA ATGGTAGCTG CAAGAACCCT AAAGATCAAT 1020
ATCGCAACTT TGGGTCCGCT TGTTCTTCCC ATAAGCGAGA AGTTTCACTT CTTTGTGAGG 1080
TTCGTTAAAA AGAAGTTTCT CAACCCCAAG CTAAAGCATT ACATTCCGGA TTTCAAGCTC 1140
GCATTCGAGC ATTTCTGTAT CCATGCGGGT GGTAGAGCGC TAATTGATGA GATGGAGAAG 1200
AATCTTCATC TAACTCCACT AGACGTTGAG GCTTCAAGAA TGACATTACA CAGGTTTGGT 1260
AATACCTCTT CGAGCTCCAT TTGGTACGAG TTGGCTTACA CAGAAGCCAA AGGAAGGATG 1320
ACGAAAGGAG ATAGGATTTG GCAGATTGCG TTGGGGTCAG GTTTTAAGTG TAATAGTTCA 1380
GTTTGGGTGG CTCTTCGTAA CGTCAAGCCT TCTACTAATA ATCCTTGGGA ACAGTGTCTA 1440
CACAAATATC CAGTTGAGAT CGATATAGAT TTAAAAGAG 1479
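
The nucleotide and amino acid entries in this listing appear to be paired, with each protein length equal to one third of the preceding DNA length. The Python sketch below illustrates that relationship by translating the first 60 bases of SEQ ID NO:3 with the standard genetic code and by checking the stated lengths. It is a minimal illustration, not part of the disclosure: the function and variable names are hypothetical, and use of the standard code in reading frame 1 is an assumption.

```python
# Minimal sketch: translate the start of SEQ ID NO:3 with the standard
# genetic code. Names are hypothetical; reading frame 1 is assumed.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry codon table (first base varies slowest).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring whitespace; stop codons give '*'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 60 bases of SEQ ID NO:3 as given in the listing.
SEQ3_START = ("ATGGATTACC CCATGAAGAA GGTAAAAATC "
              "TTTTTCAACT ACCTCATGGC GCATCGCTTC")

# Stated lengths (base pairs, amino acids) of the paired entries.
STATED_LENGTHS = {
    "SEQ ID NO:1/2": (1560, 520),
    "SEQ ID NO:3/4": (1479, 493),
    "SEQ ID NO:5/6": (1512, 504),
    "SEQ ID NO:7/8": (1650, 550),
}

if __name__ == "__main__":
    print(translate(SEQ3_START))  # one-letter amino acid codes
    for pair, (bp, aa) in STATED_LENGTHS.items():
        print(pair, "length consistent:", bp == 3 * aa)
```

The translated fragment begins Met Asp Tyr Pro Met Lys, which agrees with the opening residues listed for SEQ ID NO:4.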

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 493 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asp Tyr Pro Met Lys Lys Val Lys Ile Phe Phe Asn Tyr Leu Met
  1               5                  10                  15
Ala His Arg Phe Lys Leu Cys Phe Leu Pro Leu Met Val Ala Ile Ala
             20                  25                  30
Val Glu Ala Ser Arg Leu Ser Thr Gln Asp Leu Gln Asn Phe Tyr Leu
         35                  40                  45
Tyr Leu Gln Asn Asn His Thr Ser Leu Thr Met Phe Phe Leu Tyr Leu
     50                  55                  60
Ala Leu Gly Ser Thr Leu Tyr Leu Met Thr Arg Pro Lys Pro Val Tyr
 65                  70                  75                  80
Leu Val Asp Phe Ser Cys Tyr Leu Pro Pro Ser His Leu Lys Ala Ser
                 85                  90                  95
Thr Gln Arg Ile Met Gln His Val Arg Leu Val Arg Glu Ala Gly Ala
            100                 105                 110
Trp Lys Gln Glu Ser Asp Tyr Leu Met Asp Phe Cys Glu Lys Ile Leu
        115                 120                 125
Glu Arg Ser Gly Leu Gly Gln Glu Thr Tyr Val Pro Glu Gly Leu Gln
    130                 135                 140
Thr Leu Pro Leu Gln Gln Asn Leu Ala Val Ser Arg Ile Glu Thr Glu
145                 150                 155                 160
Glu Val Ile Ile Gly Ala Val Asp Asn Leu Phe Arg Asn Thr Gly Ile
                165                 170                 175
Ser Pro Ser Asp Ile Gly Ile Leu Val Val Asn Ser Ser Thr Phe Asn
            180                 185                 190
Pro Thr Pro Ser Leu Ser Ser Ile Leu Val Asn Lys Phe Lys Leu Arg
        195                 200                 205
Asp Asn Ile Lys Ser Leu Asn Leu Gly Gly Met Gly Cys Ser Ala Gly
    210                 215                 220
Val Ile Ala Ile Asp Ala Ala Lys Ser Leu Leu Gln Val His Arg Asn
225                 230                 235                 240
Thr Tyr Ala Leu Val Val Ser Thr Glu Asn Ile Thr Gln Asn Leu Tyr
                245                 250                 255
Met Gly Asn Asn Lys Ser Met Leu Val Thr Asn Cys Leu Phe Arg Ile
            260                 265                 270
Gly Gly Ala Ala Ile Leu Leu Ser Asn Arg Ser Ile Asp Arg Lys Arg
        275                 280                 285
Ala Lys Tyr Glu Leu Val His Thr Val Arg Val His Thr Gly Ala Asp
    290                 295                 300
Asp Arg Ser Tyr Glu Cys Ala Thr Gln Glu Glu Asp Glu Asp Gly Ile
305                 310                 315                 320
Val Gly Val Ser Leu Ser Lys Asn Leu Pro Met Val Ala Ala Arg Thr
                325                 330                 335
Leu Lys Ile Asn Ile Ala Thr Leu Gly Pro Leu Val Leu Pro Ile Ser
            340                 345                 350
Glu Lys Phe His Phe Phe Val Arg Phe Val Lys Lys Lys Phe Leu Asn
        355                 360                 365
Pro Lys Leu Lys His Tyr Ile Pro Asp Phe Lys Leu Ala Phe Glu His
    370                 375                 380
Phe Cys Ile His Ala Gly Gly Arg Ala Leu Ile Asp Glu Met Glu Lys
385                 390                 395                 400
Asn Leu His Leu Thr Pro Leu Asp Val Glu Ala Ser Arg Met Thr Leu
                405                 410                 415
His Arg Phe Gly Asn Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala
            420                 425                 430
Tyr Thr Glu Ala Lys Gly Arg Met Thr Lys Gly Asp Arg Ile Trp Gln
        435                 440                 445
Ile Ala Leu Gly Ser Gly Phe Lys Cys Asn Ser Ser Val Trp Val Ala
    450                 455                 460
Leu Arg Asn Val Lys Pro Ser Thr Asn Asn Pro Trp Glu Gln Cys Leu
465                 470                 475                 480
His Lys Tyr Pro Val Glu Ile Asp Ile Asp Leu Lys Glu
                485                 490



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1512 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CTACGTCAGG GTAGAACAAA GAGTAAACAC TTAAGCAAAA CAATTTGTCC TACTCTTAGG 60
TTATCTCCAA TGAAGAACTT AAAGATGGTT TTCTTCAAGA TCCTCTTTAT CTCTTTAATG 120
GCAGGATTAG CCATGAAAGG ATCTAAGATC AACGTAGAAG ATCTCCAAAA GTTCTCCCTC 180
CACCATACAC AGAACAACCT CCAAACCATA AGCCTTCTAT TGTTTCTTGT CGTTTTTGTG 240
TGGATCCTCT ACATGTTAAC CCGACCTAAA CCCGTTTACC TTGTTGATTT CTCCTGCTAC 300
CTTCCACCGT CGCATCTCAA GGTCAGTATC CAAACCCTAA TGGGACACGC AAGACGTGCA 360
AGAGAAGCAG GCATGTGTTG GAAGAACAAA GAGAGCGACC ATTTAGTTGA CTTCCAGGAG 420
AAGATTCTTG AACGTTCCGG TCTTGGTCAA GAAACCTACA TCCCCGAGGG TCTTCAGTGC 480
TTCCCACTTC AGCAAGGCAT GGGTGCTTCA CGTAAAGAGA CGGAAGAAGT AATCTTCGGA 540
GCTCTTGACA ATCTTTTTCG CAACACCGGT GTAAAACCTG ATGATATCGG TATATTGGTG 600
GTGAATTCTA GCACGTTTAA TCCAACTCCA TCACTCGCCT CCATGATTGT GAACAAGTAC 660
AAACTCAGAG ACAACATCAA GAGTTTGAAT CTTGGAGGGA TGGGTTGCAG TGCCGGAGTT 720
ATAGCTGTTG ATGTCGCTAA GGGATTACTA CAAGTTCATA GGAACACTTA TGCTATTGTA 780
GTAAGCACAG AGAACATCAC TCAGAACTTA TACTTGGGGA AAAACAAATC AATGCTAGTC 840
ACAAACTGTT TGTTCCGCGT TGGTGGTGCT GCGGTTCTGC TTTCAAACAG ATCTAGAGAC 900
CGTAACCGCG CCAAATACGA GCTTGTTCAC ACCGTACGGA TCCATACCGG ATCAGATGAT 960
AGGTCGTTCG AATGTGCGAC ACAAGAAGAG GATGAAGATG GTATAATTGG AGTTACCTTG 1020
ACAAAGAATC TACCTATGGT GGCTGCAAGG ACTCTTAAGA TAAATATCGC AACTTTGGGT 1080
CCTCTTGTAC TTCCATTAAA AGAGAAGCTA GCCTTCTTTA TTACTTTTGT CAAGAAGAAG 1140
TATTTCAAGC CAGAGTTAAG GAATTATACA CCAGATTTCA AGCTTGCCTT TGAGCATTTC 1200
TGTATCCACG CTGGTGGAAG AGCTCTAATA GATGAGCTGG AGAAGAACCT TAAGCTTTCT 1260
CCGTTACACG TAGAGGCGTC AAGAATGACA CTACACAGGT TTGGTAACAC TTCTTCTAGC 1320
TCAATCTGGT ACGAGTTAGC TTATACAGAA GCTAAAGGAA GGATGAAGGA AGGAGATAGG 1380
ATTTGGCAGA TTGCTTTGGG GTCAGGTTTT AAGTGTAACA GTTCAGTATG GGTGGCTCTG 1440
CGAGACGTTA AGCCTTCAGC TAACAGTCCA TGGGAAGACT GTATGGATAG ATATCCGGTT 1500
GAGATTGATA TT 1512



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 504 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Leu Arg Gln Gly Arg Thr Lys Ser Lys His Leu Ser Lys Thr Ile Cys
  1               5                  10                  15
Pro Thr Leu Arg Leu Ser Pro Met Lys Asn Leu Lys Met Val Phe Phe
             20                  25                  30
Lys Ile Leu Phe Ile Ser Leu Met Ala Gly Leu Ala Met Lys Gly Ser
         35                  40                  45
Lys Ile Asn Val Glu Asp Leu Gln Lys Phe Ser Leu His His Thr Gln
     50                  55                  60
Asn Asn Leu Gln Thr Ile Ser Leu Leu Leu Phe Leu Val Val Phe Val
 65                  70                  75                  80
Trp Ile Leu Tyr Met Leu Thr Arg Pro Lys Pro Val Tyr Leu Val Asp
                 85                  90                  95
Phe Ser Cys Tyr Leu Pro Pro Ser His Leu Lys Val Ser Ile Gln Thr
            100                 105                 110
Leu Met Gly His Ala Arg Arg Ala Arg Glu Ala Gly Met Cys Trp Lys
        115                 120                 125
Asn Lys Glu Ser Asp His Leu Val Asp Phe Gln Glu Lys Ile Leu Glu
    130                 135                 140
Arg Ser Gly Leu Gly Gln Glu Thr Tyr Ile Pro Glu Gly Leu Gln Cys
145                 150                 155                 160
Phe Pro Leu Gln Gln Gly Met Gly Ala Ser Arg Lys Glu Thr Glu Glu
                165                 170                 175
Val Ile Phe Gly Ala Leu Asp Asn Leu Phe Arg Asn Thr Gly Val Lys
            180                 185                 190
Pro Asp Asp Ile Gly Ile Leu Val Val Asn Ser Ser Thr Phe Asn Pro
        195                 200                 205
Thr Pro Ser Leu Ala Ser Met Ile Val Asn Lys Tyr Lys Leu Arg Asp
    210                 215                 220
Asn Ile Lys Ser Leu Asn Leu Gly Gly Met Gly Cys Ser Ala Gly Val
225                 230                 235                 240
Ile Ala Val Asp Val Ala Lys Gly Leu Leu Gln Val His Arg Asn Thr
                245                 250                 255
Tyr Ala Ile Val Val Ser Thr Glu Asn Ile Thr Gln Asn Leu Tyr Leu
            260                 265                 270
Gly Lys Asn Lys Ser Met Leu Val Thr Asn Cys Leu Phe Arg Val Gly
        275                 280                 285
Gly Ala Ala Val Leu Leu Ser Asn Arg Ser Arg Asp Arg Asn Arg Ala
    290                 295                 300
Lys Tyr Glu Leu Val His Thr Val Arg Ile His Thr Gly Ser Asp Asp
305                 310                 315                 320
Arg Ser Phe Glu Cys Ala Thr Gln Glu Glu Asp Glu Asp Gly Ile Ile
                325                 330                 335
Gly Val Thr Leu Thr Lys Asn Leu Pro Met Val Ala Ala Arg Thr Leu
            340                 345                 350
Lys Ile Asn Ile Ala Thr Leu Gly Pro Leu Val Leu Pro Leu Lys Glu
        355                 360                 365
Lys Leu Ala Phe Phe Ile Thr Phe Val Lys Lys Lys Tyr Phe Lys Pro
    370                 375                 380
Glu Leu Arg Asn Tyr Thr Pro Asp Phe Lys Leu Ala Phe Glu His Phe
385                 390                 395                 400
Cys Ile His Ala Gly Gly Arg Ala Leu Ile Asp Glu Leu Glu Lys Asn
                405                 410                 415
Leu Lys Leu Ser Pro Leu His Val Glu Ala Ser Arg Met Thr Leu His
            420                 425                 430
Arg Phe Gly Asn Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala Tyr
        435                 440                 445
Thr Glu Ala Lys Gly Arg Met Lys Glu Gly Asp Arg Ile Trp Gln Ile
    450                 455                 460
Ala Leu Gly Ser Gly Phe Lys Cys Asn Ser Ser Val Trp Val Ala Leu
465                 470                 475                 480
Arg Asp Val Lys Pro Ser Ala Asn Ser Pro Trp Glu Asp Cys Met Asp
                485                 490                 495
Arg Tyr Pro Val Glu Ile Asp Ile
            500
(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1650 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

ATGGGTAGAT CCAACGAGCA AGATCTGCTC TCTACCGAGA TCGTTAATCG TGGGATCGAA 60
CCATCCGGTC CTAACGCCGG CTCACCAACG TTCTCGGTTA GGGTCAGGAG ACGTTTGCCT 120
GATTTTCTTC AGTCGGTGAA CTTGAAGTAC GTGAAACTTG GTTACCACTA CCTCATAAAC 180
CATGCGGTTT ATTTGGCGAC CATACCGGTT CTTGTGCTGG TTTTTAGTGC TGAGGTTGGG 240
AGTTTAAGCA GAGAAGAGAT TTGGAAGAAG CTTTGGGACT ATGATCTTGC AACTGTTATC 300
GGATTCTTCG GTGTCTTTGT TTTAACCGCT TGTGTCTACT TCATGTCTCG TCCTCGCTCT 360
GTTTATCTTA TTGATTTCGC TTGTTACAAG CCCTCCGATG AACACAAGGT GACAAAAGAA 420
GAGTTCATAG AACTAGCGAG AAAATCAGGG AAGTTCGACG AAGAGACACT CGGTTTCAAG 480
AAGAGGATCT TACAAGCCTC AGGCATAGGC GACGAGACAT ACGTCCCAAG ATCCATCTCT 540
TCATCAGAAA ACATAACAAC GATGAAAGAA GGTCGTGAAG AAGCCTCTAC AGTGATCTTT 600
GGAGCACTAG ACGAACTCTT CGAGAAGACA CGTGTAAAAC CTAAAGACGT TGGTGTCCTT 660
GTGGTTAACT GTAGCATTTT CAACCCGACA CCGTCGTTGT CCGCAATGGT GATAAACCAT 720
TACAAGATGA GAGGGAACAT ACTTAGTTAC AACCTTGGAG GGATGGGATG TTCGGCTGGA 780
ATCATAGCTA TTGATCTTGC TCGTGACATG CTTCAGTCTA ACCCTAATAG TTATGCTGTT 840
GTTGTGAGTA CTGAGATGGT TGGGTATAAT TGGTACGTGG GAAGTGACAA GTCAATGGTT 900
ATACCTAATT GTTTCTTTAG GATGGGTTGT TCTGCCGTTA TGCTCTCTAA CCGTCGTCGT 960
GACTTTCGCC ATGCTAAGTA CCGTCTCGAG CACATTGTCC GAACTCATAA GGCTGCTGAC 1020
GACCGTAGCT TCAGGAGTGT GTACCAGGAA GAAGATGAAC AAGGATTCAA GGGGTTGAAG 1080
ATAAGTAGAG ACTTAATGGA AGTTGGAGGT GAAGCTCTCA AGACAAACAT CACTACCTTA 1140
GGTCCTCTTG TCCTACCTTT CTCCGAGCAG CTTCTCTTCT TTGCTGCTTT GGTCCGCCGA 1200
ACATTCTCAC CTGCTGCCAA AACGTCCACA ACCACTTCCT TCTCTACTTC CGCCACCGCA 1260
AAAACCAATG GAATCAAGTC TTCCTCTTCC GATCTGTCCA AGCCATACAT CCCGGACTAC 1320
AAGCTCGCCT TCGAGCATTT TTGCTTCCAC GCGGCAAGCA AAGTAGTGCT TGAAGAGCTT 1380
CAAAAGAATC TAGGCTTGAG TGAAGAGAAT ATGGAGGCTT CTAGGATGAC ACTTCACAGG 1440
TTTGGAAACA CTTCTAGCAG TGGAATCTGG TATGAGTTGG CTTACATGGA GGCCAAGGAA 1500
AGTGTTCGTA GAGGCGATAG GGTTTGGCAG ATCGCTTTCG GTTCTGGTTT TAAGTGTAAC 1560
AGTGTGGTGT GGAAGGCAAT GAGGAAGGTG AAGAAGCCAA CCAGGAACAA TCCTTGGGTG 1620
GATTGCATCA ACCGTTACCC TGTGCCTCTC 1650

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 550 amino acids 

(B) TYPE : amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 



Met 


Gly Arg 


Ser 


Asn 


Glu 


Gin 


Asp 


Leu 


Leu 


Ser 


Thr 


Glu 


He 


Val 


Asn 


1 








5 










10 










15 




Arg 


Gly 


lie 


Glu 
20 


Pro 


Ser 


Gly 


Pro 


Asn 
25 


Ala 


Gly 


Ser 


Pro 


Thr 
30 


Phe 


Ser 


Val 


Arg 


Val 
35 


Arg 


Arg 


Arg 


Leu 


Pro 
40 


Asp 


Phe 


Leu 


Gin 


Ser 
45 


Val 


Asn 


Leu 


Lys 


Tyr 
50 


Val 


Lys 


Leu 


Gly 


Tyr 
55 


His 


Tyr 


Leu 


He 


Asn 
60 


His 


Ala 


Val 


Tyr 


Leu 


Ala 


Thr 


He 


Pro 


Val 


Leu 


Val 


Leu 


Val 


Phe 


Ser 


Ala 


Glu 


Val 


Gly 


65 










70 










75 










80 


Ser 


Leu 


Ser 


Arg 


Glu 
85 


Glu 


He 


Trp 


Lys 


Lys 
90 


Leu 


Trp 


Asp 


Tyr 


Asp 
95 


Leu 


Ala 


Thr 


Val 


He 
100 


Gly 


Phe 


Phe 


Gly 


Val 
105 


Phe 


Val 


Leu 


Thr 


Ala 
110 


Cys 


Val 
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xy x 


Phe 


Met 


OC-i- 


Arg 


Pro 


Arg 


Ser 


Val 


xyx 


XJC LI 


xxc 


Asp 


Jrixc 


AT ^ 
nla 


Cys 






115 










120 










-IOC 
-LA Z> 






Tvr 


Lys 
130 


Pro 


Ser 


Asp 


Glu 


His 
135 


Lys 


Val 


Thr 


Lys 


ri~\ ,, 
"L40 

x*z \j 




xrXic 


xxc 


blU 


Leu 


Ala 


Arg 


Lys 


Ser 


Gly 


Lys 


Phe 


Asp 


Glu 


Glu 


X IXX 


Leu 


vjiy 


IT XXC 


Lys 


145 










150 










X -J> -J 










xou 


Lys 


Arg 


lie 


Leu 


Gin 


Ala 


Ser 


Gly 


lie 


Gly 


Zv en 


VJX LI 


Xxix 


Tyr 


vai 


Pro 










165 










170 








X / 3 




A.Y*Cf 


Seir 




OCX 


Ser 


Ser 


Glu 


Asn 


Tip 
X xc 


X XXX 


XI1X 


Mp|- 


Lys 


blU 


Caiy 


Arg 








180 










185 








190 


ox la 


\J J- IX 


A1 ^ 


OCX. 


XXIX 


v ctx 


X X c 


Ir IXC 


Laiy 


Ala 
rild 


Leu 


Asp 


CalU 


Leu 


Phe 


Glu 
















ri fi 

4& u u 








*^ r\ c 










XXI XT 


Arg 


vai 


Lys 


riO 


T 

Lys 


Asp 


vai 


pl lr 

biy 


vax 


Leu 


val 


Val 


Asn 


Cys 




210 










X 3 


















V 

DC! 


XXC 


xrXie 


Asn 


Pro 


xnxv 


Pro 


Ser 


Leu 


Ser 


i\i a 


Met 


val 


He 


Asn 


His 


225 










230 










*w .3 3 










Z 4 U 


xyx 






A "r*n 

.rtX y 


oJ.y 


A eST"i 


Tip 
X xc 


T 

LiSU 


OCX 


xyr 


Asn 


Leu 


pi 

tariy 


uiy 


Met 


Gly 










245 




















0 c c 


_y to 




Al a 


r*"i «, r 

uiy 
9 ^ n 


Tip 

X xc 


Tip 
X XC 


AT ;=) 


Tip 
X XC 


Asp 


Leu 


J\±cL 


Arg 


Asp 


Met 

2, 70 


Leu 


Gin 


Ser 


Asri 


Pro 


Asn 


Q p -x- 
OCX 


xyr 


Al 33 


V etx 


V cix 


val 


Ser 


xnr 


bill 


lyiet: 


Val 


Gly 






275 










2 80 










Z O 3 






lyr 




irp 


_ 

lyr 


V C±X 


uiy 


_ 

oer 


Asp 


Lys 


Ser 


Met 


Val 


He 


Pro 


Asn 


Cys 




290 




















300 








Xr lie 




*** y 




\jxy 


y O 


OCX 


Ala 


vax 


Ma f- 

l v lcC 


Leu 


Ser 


Asn 


Arg 


Arg 


Arg 


305 










310 










315 








32 0 


Asp 


Phe 


y 


His 


Ala 


xi _y o 


xyr 


A ~r~n 
rii y 


Leu 


vjtX IX 


His 


He 


vai 


Arg 


Tnr 


TT ' — 

HIS 










325 










330 








335 




Ly s 


Ala 


Ala 


Asp 


Asp 


Arg 


Ser 


Phe 


A vn 
.rax y 


OCX 


Val 


Tyr 


Pin 


nl n 


bill 


Asp 








340 










345 










35 0 




Glu 


Gin 


Gly 


Phe 


Lys 


Gly 


Leu 


Lys 


Tie 

X xc 


C p -v~ 

OCX 


Arg Asp 


Leu 


Mpf 
i v lCL 


bill 


vai 






355 










360 










365 








Glv 


Glv 


Glu 


Ala 


Leu 


Lys 


Thr 


Asn 


He 


Thr 


Thr 


Leu 




rlU 


Leu 


vai 




370 










375 










380 








Leu 


Pro 


Phe 


Ser 


Glu 


Gin 


Leu 


Leu 


CTIIC 


XT IIC 


Ala 


Ala 


_ 

Leu 


Veil 


Arg 


Arg 


385 










390 










395 








a fi 
ft u u 


Thr 


Phe 


Ser 


Pro 


Ala 
405 


Ala 


Lys 


Thr 


Ser 


Thr 
410 


Thr 


Thr 


ocx 


Xr HC 


OCX 

415 


Xxix 


Ser 


Ala 


Thr 


Ala 


Lys 


Thr 


Asn 


Gly 


X xc 


uy o 


Ser 


Ser 


OCX 


OCX 


Asp 


Leu 








420 










425 










430 




O ^ X 


xjy & 


Pro 

ZT d- <J 


i y r 


Tip 

X xc 


It X > 


^-it> LJ 


lyr 


Lys 


Leu 


Ala 


Phe 


ulU 


HIS 


j^ne 


Cys 






435 










44 0 










*± ■** o 






Phe 


fix & 




Al s 




T 

lys 


VcLX 


Val 


Leu 


uJLU 


Glu 


Leu 


bin 


Lys 


Asn 


Leu 




450 










A^^ 










460 










XjC Ll 


OCX 


VjjX Li 


bXU 


Asn 


J. v lc T, 


blU 


7v "1 


Ser 


Arg 


Met 


ixir 


Leu 


His 


Arg 


465 










470 










475 










A. Q A 


Phe Gly Asn Thr Ser Ser Ser Gly Ile Trp Tyr Glu Leu Ala Tyr Met
                485                 490                 495

Glu Ala Lys Glu Ser Val Arg Arg Gly Asp Arg Val Trp Gln Ile Ala
            500                 505                 510

Phe Gly Ser Gly Phe Lys Cys Asn Ser Val Val Trp Lys Ala Met Arg
        515                 520                 525

Lys Val Lys Lys Pro Thr Arg Asn Asn Pro Trp Val Asp Cys Ile Asn
    530                 535                 540

Arg Tyr Pro Val Pro Leu
545                 550



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1611 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TCGAGCTACG TCAGGGCTTT TATATGCACA AATTCTCATA AAGTTTTCAA TTTTATTCCA 60

TTTTTCTCGG AAGCCATGGA AGCTGCTAAT GAGCCTGTTA ATGGCGGATC CGTACAGATC 120

CGAACAGAGA ACAACGAAAG ACGAAAGCTT CCTAATTTCT TACAAAGCGT CAACATGAAA 180

TACGTCAAGC TAGGTTATCA TTACCTCATT ACTCATCTCT TCAAGCTCTG TTTGGTTCCA 240

TTAATGGCGG TTTTAGTCAC AGAGATCTCT CGATTAACAA CAGACGATCT TTACCAGATT 300

TGGCTTCATC TCCAATACAA TCTCGTTGCT TTCATCTTTC TCTCTGCTTT AGCTATCTTT 360

GGCTCCACCG TTTACATCAT GAGTCGTCCC AGATCTGTTT ATCTCGTTGA TTACTCTTGT 420

TATCTTCCTC CGGAGAGTCT TCAGGTTAAG TATCAGAAGT TTATGGATCA TTCTAAGTTG 480

ATTGAAGATT TCAATGAGTC ATCTTTAGAG TTTCAGAGGA AGATTCTTGA ACGTTCTGGT 540

TTAGGAGAAG AGACTTATCT CCCTGAAGCT TTACATTGTA TCCCTCCGAG GCCTACGATG 600

ATGGCGGCTC GTGAGGAATC TGAGCAGGTA ATGTTTGGTG CTCTTGATAA GCTTTTCGAG 660

AATACCAAGA TTAACCCTAG GGATATTGGT GTGTTGGTTG TGAATTGTAG CTTGTTTAAT 720

CCTACACCTT CGTTGTCAGC TATGATTGTT AACAAGTATA AGCTTAGAGG GAATGTTAAG 780

AGTTTTAACC TTGGTGGAAT GGGGTGTAGT GCTGGTGTTA TCTCTATCGA TTTAGCTAAA 840

GATATGTTGC AAGTTCATAG GAATACTTAT GCTGTTGTGG TTAGTACTGA GAACATTACT 900

CAGAATTGGT ATTTTGGGAA TAAGAAGGCT ATGTTGATTC CGAATTGTTT GTTTCGTGTT 960

GGTGGTTCGG CGATTTTGTT GTCGAACAAG GGGAAAGATC GTAGACGGTC TAAGTATAAG 1020

CTTGTTCATA CCGTTAGGAC TCATAAAGGA GCTGTTGAGA AGGCTTTCAA CTGTGTTTAC 1080

CAAGAGCAAG ATGATAATGG GAAGACCGGG GTTTCGTTGT CGAAAGATCT TATGGCTATA 1140

GCTGGGGAAG CTCTTAAGGC GAATATCACT ACTTTAGGTC CTTTGGTTCT TCCTATAAGT 1200

GAGCAGATTC TGTTTTTCAT GACTTTGGTT ACGAAGAAAC TGTTTAACTC GAAGCTGAAG 1260

CCGTATATTC CGGATTTCAA GCTTGCGTTT GATCATTTCT GTATCCATGC TGGTGGTAGA 1320

GCTGTGATTG ATGAGCTTGA GAAGAATCTG CAGCTTTCGC AGACTCATGT CGAGGCATCC 1380

AGAATGACAC TGCACAGATT TGGAAACACT TCTTCGAGCT CGATTTGGTA TGAACTGGCT 1440

TACATAGAGG CTAAAGGTAG GATGAAGAAA GGAAACCGGG TTTGGCAGAT TGCTTTTGGA 1500

AGTGGGTTTA AGTGTAACAG TGCAGTTTGG GTGGCTCTAA ACAATGTCAA GCCTTCGGTT 1560

AGTAGTCCGT GGGAACACTG CATCGACCGA TATCCGGTTA AGCTCGACTT C 1611
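
The correspondence between the 1611-base nucleotide listing of SEQ ID NO: 9 and the 537-residue polypeptide of SEQ ID NO: 10 (1611 / 3 = 537) can be checked by conceptual translation. The following is a minimal sketch in Python, assuming the standard genetic code and assuming that the first base of the listing begins the reading frame. The function name `translate` and the use of "Ter" for a stop codon are illustrative choices and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(listing: str) -> list[str]:
    """Translate a nucleotide listing, ignoring spaces and position numbers."""
    seq = "".join(ch for ch in listing.upper() if ch in "ACGT")
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First two groups of SEQ ID NO: 9, including the trailing position number.
print(translate("TCGAGCTACG TCAGGGCTTT 60"))
```

Applied to the first two ten-base groups of SEQ ID NO: 9, the sketch yields the residues that open SEQ ID NO: 10 (Ser Ser Tyr Val Arg Ala).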

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 537 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



Ser Ser Tyr Val Arg Ala Phe Ile Cys Thr Asn Ser His Lys Val Phe
1               5                   10                  15

Asn Phe Ile Pro Phe Phe Ser Glu Ala Met Glu Ala Ala Asn Glu Pro
            20                  25                  30

Val Asn Gly Gly Ser Val Gln Ile Arg Thr Glu Asn Asn Glu Arg Arg
        35                  40                  45

Lys Leu Pro Asn Phe Leu Gln Ser Val Asn Met Lys Tyr Val Lys Leu
    50                  55                  60

Gly Tyr His Tyr Leu Ile Thr His Leu Phe Lys Leu Cys Leu Val Pro
65                  70                  75                  80

Leu Met Ala Val Leu Val Thr Glu Ile Ser Arg Leu Thr Thr Asp Asp
                85                  90                  95

Leu Tyr Gln Ile Trp Leu His Leu Gln Tyr Asn Leu Val Ala Phe Ile
            100                 105                 110

Phe Leu Ser Ala Leu Ala Ile Phe Gly Ser Thr Val Tyr Ile Met Ser
        115                 120                 125

Arg Pro Arg Ser Val Tyr Leu Val Asp Tyr Ser Cys Tyr Leu Pro Pro
    130                 135                 140

Glu Ser Leu Gln Val Lys Tyr Gln Lys Phe Met Asp His Ser Lys Leu
145                 150                 155                 160

Ile Glu Asp Phe Asn Glu Ser Ser Leu Glu Phe Gln Arg Lys Ile Leu
                165                 170                 175

Glu Arg Ser Gly Leu Gly Glu Glu Thr Tyr Leu Pro Glu Ala Leu His
            180                 185                 190

Cys Ile Pro Pro Arg Pro Thr Met Met Ala Ala Arg Glu Glu Ser Glu
        195                 200                 205

Gln Val Met Phe Gly Ala Leu Asp Lys Leu Phe Glu Asn Thr Lys Ile
    210                 215                 220

Asn Pro Arg Asp Ile Gly Val Leu Val Val Asn Cys Ser Leu Phe Asn
225                 230                 235                 240

Pro Thr Pro Ser Leu Ser Ala Met Ile Val Asn Lys Tyr Lys Leu Arg
                245                 250                 255

Gly Asn Val Lys Ser Phe Asn Leu Gly Gly Met Gly Cys Ser Ala Gly
            260                 265                 270

Val Ile Ser Ile Asp Leu Ala Lys Asp Met Leu Gln Val His Arg Asn
        275                 280                 285

Thr Tyr Ala Val Val Val Ser Thr Glu Asn Ile Thr Gln Asn Trp Tyr
    290                 295                 300

Phe Gly Asn Lys Lys Ala Met Leu Ile Pro Asn Cys Leu Phe Arg Val
305                 310                 315                 320

Gly Gly Ser Ala Ile Leu Leu Ser Asn Lys Gly Lys Asp Arg Arg Arg
                325                 330                 335

Ser Lys Tyr Lys Leu Val His Thr Val Arg Thr His Lys Gly Ala Val
            340                 345                 350

Glu Lys Ala Phe Asn Cys Val Tyr Gln Glu Gln Asp Asp Asn Gly Lys
        355                 360                 365

Thr Gly Val Ser Leu Ser Lys Asp Leu Met Ala Ile Ala Gly Glu Ala
    370                 375                 380

Leu Lys Ala Asn Ile Thr Thr Leu Gly Pro Leu Val Leu Pro Ile Ser
385                 390                 395                 400

Glu Gln Ile Leu Phe Phe Met Thr Leu Val Thr Lys Lys Leu Phe Asn
                405                 410                 415

Ser Lys Leu Lys Pro Tyr Ile Pro Asp Phe Lys Leu Ala Phe Asp His
            420                 425                 430

Phe Cys Ile His Ala Gly Gly Arg Ala Val Ile Asp Glu Leu Glu Lys
        435                 440                 445

Asn Leu Gln Leu Ser Gln Thr His Val Glu Ala Ser Arg Met Thr Leu
    450                 455                 460

His Arg Phe Gly Asn Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala
465                 470                 475                 480

Tyr Ile Glu Ala Lys Gly Arg Met Lys Lys Gly Asn Arg Val Trp Gln
                485                 490                 495

Ile Ala Phe Gly Ser Gly Phe Lys Cys Asn Ser Ala Val Trp Val Ala
            500                 505                 510

Leu Asn Asn Val Lys Pro Ser Val Ser Ser Pro Trp Glu His Cys Ile
        515                 520                 525

Asp Arg Tyr Pro Val Lys Leu Asp Phe
    530                 535

















(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1502 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TCTCCGACGA TGCCTCAGGC ACCGATGCCA GAGTTCTCTA GCTCGGTGAA GCTCAAGTAC 60

GTGAAACTTG GTTACCAATA TTTGGTTAAC CATTTCTTGA GTTTTCTTTT GATCCCGATC 120

ATGGCTATTG TCGCCGTTGA GCTTCTTCGG ATGGGTCCTG AAGAGATCCT TAATGTTTGG 180

AATTCACTCC AGTTTGACCT AGTTCAGGTT CTATGTTCTT CCTTCTTTGT CATCTTCATC 240

TCCACTGTTT ACTTCATGTC CAAGCCACGC ACCATCTACC TCGTTGACTA TTCTTGTTAC 300

AAGCCACCTG TCACGTGTCG TGTCCCCTTC GCAACTTTCA TGGAACACTC TCGTTTGATC 360

CTCAAGGACA AGCCTAAGAG CGTCGAGTTC CAAATGAGAA TCCTTGAACG TTCTGGCCTC 420

GGTGAGGAGA CTTGTCTCCC TCCGGCTATT CATTATATTC CTCCCACACC AACCATGGAC 480

GCGGCTAGAA GCGAGGCTCA GATGGTTATC TTCGAGGCCA TGGACGATCT TTTCAAGAAA 540

ACCGGTCTTA AACCTAAAGA CGTCGACATC CTTATCGTCA ACTGCTCTCT TTTCTCTCCC 600

ACACCATCGC TCTCAGCTAT GGTCATCAAC AAATATAAGC TTAGGAGTAA TATCAAGAGC 660

TTCAATCTTT CGGGGATGGG CTGCAGCGCG GGCCTGATCT CAGTTGATCT AGCCCGCGAC 720

TTGCTCCAAG TTCATCCCAA TTCAAATGCA ATCATCGTCA GCACGGAGAT CATAACGCCT 780

AATTACTATC AAGGCAACGA GAGAGCCATG TTGTTACCCA ATTGTCTCTT CCGCATGGGT 840

GCGGCAGCCA TACACATGTC AAACCGCCGG TCTGACCGGT GGCGAGCCAA ATACAAGCTT 900

TCCCACCTCG TCCGGACACA CCGTGGCGCT GACGACAAGT CTTTCTACTG TGTCTACGAA 960

CAGGAAGACA AAGAAGGACA CGTTGGCATC AACTTGTCCA AAGATCTCAT GGCCATCGCC 1020

GGTGAAGCCC TCAAGGCAAA CATCACCACA ATAGGTCCTT TGGTCCTACC GGCGTCAGAA 1080

CAACTTCTCT TCCTCACGTC CCTAATCGGA CGTAAAATCT TCAACCCGAA ATGGAAACCA 1140

TACATACCGG ATTTCAAGCT GGCCTTCGAA CACTTTTGCA TTCACGCAGG AGGCAGAGCG 1200

GTGATCGACG AGCTCCAAAA GAATCTACAA CTATCAGGAG AACACGTTGA GGCCTCAAGA 1260

ATGACACTAC ATCGTTTTGG TAACACGTCA TCTTCATCGT TATGGTACGA GCTTAGCTAC 1320

ATCGAGTCTA AAGGGAGAAT GAGGAGAGGC GATCGCGTTT GGCAAATCGC GTTTGGGAGT 1380

GGTTTCAAGT GTAACTCTGC CGTGTGGAAG TGTAACCGTA CGATTAAGAC ACCTAAGGAC 1440

GGACCATGGT CCGATTGTAT CGACCGTTAC CCTGTCTTTA TTCCCGAAGT TGTCAAACTC 1500

TA 1502



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



Ser Pro Thr Met Pro Gln Ala Pro Met Pro Glu Phe Ser Ser Ser Val
1               5                   10                  15

Lys Leu Lys Tyr Val Lys Leu Gly Tyr Gln Tyr Leu Val Asn His Phe
            20                  25                  30

Leu Ser Phe Leu Leu Ile Pro Ile Met Ala Ile Val Ala Val Glu Leu
        35                  40                  45

Leu Arg Met Gly Pro Glu Glu Ile Leu Asn Val Trp Asn Ser Leu Gln
    50                  55                  60

Phe Asp Leu Val Gln Val Leu Cys Ser Ser Phe Phe Val Ile Phe Ile
65                  70                  75                  80

Ser Thr Val Tyr Phe Met Ser Lys Pro Arg Thr Ile Tyr Leu Val Asp
                85                  90                  95

Tyr Ser Cys Tyr Lys Pro Pro Val Thr Cys Arg Val Pro Phe Ala Thr
            100                 105                 110

Phe Met Glu His Ser Arg Leu Ile Leu Lys Asp Lys Pro Lys Ser Val
        115                 120                 125

Glu Phe Gln Met Arg Ile Leu Glu Arg Ser Gly Leu Gly Glu Glu Thr
    130                 135                 140

Cys Leu Pro Pro Ala Ile His Tyr Ile Pro Pro Thr Pro Thr Met Asp
145                 150                 155                 160

Ala Ala Arg Ser Glu Ala Gln Met Val Ile Phe Glu Ala Met Asp Asp
                165                 170                 175

Leu Phe Lys Lys Thr Gly Leu Lys Pro Lys Asp Val Asp Ile Leu Ile
            180                 185                 190

Val Asn Cys Ser Leu Phe Ser Pro Thr Pro Ser Leu Ser Ala Met Val
        195                 200                 205

Ile Asn Lys Tyr Lys Leu Arg Ser Asn Ile Lys Ser Phe Asn Leu Ser
    210                 215                 220

Gly Met Gly Cys Ser Ala Gly Leu Ile Ser Val Asp Leu Ala Arg Asp
225                 230                 235                 240

Leu Leu Gln Val His Pro Asn Ser Asn Ala Ile Ile Val Ser Thr Glu
                245                 250                 255

Ile Ile Thr Pro Asn Tyr Tyr Gln Gly Asn Glu Arg Ala Met Leu Leu
            260                 265                 270

Pro Asn Cys Leu Phe Arg Met Gly Ala Ala Ala Ile His Met Ser Asn
        275                 280                 285

Arg Arg Ser Asp Arg Trp Arg Ala Lys Tyr Lys Leu Ser His Leu Val
    290                 295                 300

Arg Thr His Arg Gly Ala Asp Asp Lys Ser Phe Tyr Cys Val Tyr Glu
305                 310                 315                 320

Gln Glu Asp Lys Glu Gly His Val Gly Ile Asn Leu Ser Lys Asp Leu
                325                 330                 335

Met Ala Ile Ala Gly Glu Ala Leu Lys Ala Asn Ile Thr Thr Ile Gly
            340                 345                 350

Pro Leu Val Leu Pro Ala Ser Glu Gln Leu Leu Phe Leu Thr Ser Leu
        355                 360                 365

Ile Gly Arg Lys Ile Phe Asn Pro Lys Trp Lys Pro Tyr Ile Pro Asp
    370                 375                 380

Phe Lys Leu Ala Phe Glu His Phe Cys Ile His Ala Gly Gly Arg Ala
385                 390                 395                 400

Val Ile Asp Glu Leu Gln Lys Asn Leu Gln Leu Ser Gly Glu His Val
                405                 410                 415

Glu Ala Ser Arg Met Thr Leu His Arg Phe Gly Asn Thr Ser Ser Ser
            420                 425                 430

Ser Leu Trp Tyr Glu Leu Ser
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1548 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGGACGGTG CCGGAGAATC ACGACTCGGT GGTGATGGTG GTGGTGATGG TTCTGTTGGA 60

GTTCAGATCC GACAAACACG GATGCTACCG GATTTTCTCC AGAGCGTGAA TCTCAAGTAT 120

GTGAAATTAG GTTACCATTA CTTAATCTCA AATCTCTTGA CTCTCTGTTT ATTCCCTCTC 180

GCCGTTGTTA TCTCCGTCGA AGCCTCTCAG ATGAACCCAG ATGATCTCAA ACAGCTCTGG 240

ATCCATCTAC AATACAATCT GGTTAGTATC ATCATCTGTT CAGCGATTCT AGTCTTCGGG 300

TTAACGGTTT ATGTTATGAC CCGACCTAGA CCCGTTTACT TGGTTGATTT CTCTTGTTAT 360

CTCCCACCTG ATCATCTCAA AGCTCCTTAC GCTCGGTTCA TGGAACATTC TAGACTCACC 420

GGAGATTTCG ATGACTCTGC TCTCGAGTTT CAACGCAAGA TCCTTGAGCG TTCTGGTTTA 480

GGGGAAGACA CTTATGTCCC TGAAGCTATG CATTATGTTC CACCGAGAAT TTCAATGGCT 540

GCTGCTAGAG AAGAAGCTGA ACAAGTCATG TTTGGTGCTT TAGATAACCT TTTCGCTAAC 600

ACTAATGTGA AACCAAAGGA TATTGGAATC CTTGTTGTGA ATTGTAGTCT CTTTAATCCA 660

ACTCCTTCGT TATCTGCAAT GATTGTGAAC AAGTATAAGC TTAGAGGTAA CATTAGAAGC 720

TACAATCTAG GCGGTATGGG TTGCAGCGCG GGAGTTATCG CTGTGGATCT TGCTAAAGAC 780

ATGTTGTTGG TACATAGGAA CACTTATGCG GTTGTTGTTT CTACTGAGAA CATTACTCAG 840

AATTGGTATT TTGGTAACAA GAAATCGATG TTGATACCGA ACTGCTTGTT TCGAGTTGGT 900

GGCTCTGCGG TTTTGCTATC GAACAAGTCG AGGGACAAGA GACGGTCTAA GTACAGGCTT 960

GTACATGTAG TCAGGACTCA CCGTGGAGCA GATGATAAAG CTTTCCGTTG TGTTTATCAA 1020

GAGCAGGATG ATACAGGGAG AACCGGGGTT TCGTTGTCGA AAGATCTAAT GGCGATTGCA 1080

GGGGAAACTC TCAAAACCAA TATCACTACA TTGGGTCCTC TTGTTCTACC GATAAGTGAG 1140

CAGATTCTCT TCTTTATGAC TCTAGTTGTG AAGAAGCTCT TTAACGGTAA AGTGAAACCG 1200

TATATCCCGG ATTTCAAACT TGCTTTCGAG CATTTCTGTA TCCATGCTGG TGGAAGAGCT 1260

GTGATCGATG AGTTAGAGAA GAATCTGCAG CTTTCACCAG TTCATGTCGA GGCTTCGAGG 1320

ATGACTCTTC ATCGATTTGG TAACACATCT TCGAGCTCCA TTTGGTATGA ATTGGCTTAC 1380

ATTGAAGCGA AGGGAAGGAT GCGAAGAGGT AATCGTGTTT GGCAAATCGC GTTCGGAAGT 1440

GGATTTAAAT GTAATAGCGC GATTTGGGAA GCATTAAGGC ATGTGAAACC TTCGAACAAC 1500

AGTCCTTGGG AAGATTGTAT TGACAAGTAT CCGGTAACTT TAAGTTAT 1548

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
Thr Ser Ser Ser Ser Ile Trp Tyr Glu Leu Ala Tyr Ile Glu Ala Lys
    450                 455                 460

Gly Arg Met Arg Arg Gly Asn Arg Val Trp Gln Ile Ala Phe Gly Ser
465                 470                 475                 480

Gly Phe Lys Cys Asn Ser Ala Ile Trp Glu Ala Leu Arg His Val Lys
                485                 490                 495

Pro Ser Asn Asn Ser Pro Trp Glu Asp Cys Ile Asp Lys Tyr Pro Val
            500                 505                 510

Thr Leu Ser Tyr
        515
WHAT IS CLAIMED IS:

1. An isolated polynucleotide encoding a polypeptide
having an amino acid sequence selected from the group
consisting of: an amino acid sequence substantially
identical to SEQ ID NO: 2, an amino acid sequence
substantially identical to SEQ ID NO: 4, an amino acid
sequence substantially identical to SEQ ID NO: 6, an amino
acid sequence substantially identical to SEQ ID NO: 8, an
amino acid sequence substantially identical to SEQ ID
NO: 10, an amino acid sequence substantially identical to
SEQ ID NO: 12, and an amino acid sequence substantially
identical to SEQ ID NO: 14.

2. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 2.

3. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 4.

4. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 6.

5. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 8.

6. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 10.

7. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 12.

8. The polynucleotide of claim 1, wherein said amino
acid sequence is SEQ ID NO: 14.






WO 98/54954 



PCT/US98/11384 






9. An isolated polynucleotide, wherein said
polynucleotide is selected from the group consisting of:

a) SEQ ID NO: 1;
b) SEQ ID NO: 3;
c) SEQ ID NO: 5;
d) SEQ ID NO: 7;
e) SEQ ID NO: 9;
f) SEQ ID NO: 11;
g) SEQ ID NO: 13;
h) an RNA analog of SEQ ID NO: 1;
i) an RNA analog of SEQ ID NO: 3;
j) an RNA analog of SEQ ID NO: 5;
k) an RNA analog of SEQ ID NO: 7;
l) an RNA analog of SEQ ID NO: 9;
m) an RNA analog of SEQ ID NO: 11;
n) an RNA analog of SEQ ID NO: 13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.



10. An isolated polypeptide having an amino acid 

sequence selected from the group consisting of: an amino 
acid sequence substantially identical to SEQ ID NO:2, an 
amino acid sequence substantially identical to SEQ ID 
NO: 4, an amino acid sequence substantially identical to 
SEQ ID NO: 6, an amino acid sequence substantially 
identical to SEQ ID NO: 8, an amino acid sequence 
substantially identical to SEQ ID NO: 10, an amino acid 
sequence substantially identical to SEQ ID NO: 12, and an 




amino acid sequence substantially identical to SEQ ID 
NO: 14.

11. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO:2. 

12. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO:4. 

13. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO: 6.

14. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO: 8. 

15. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO: 10. 

16. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO: 12. 

17. The polypeptide of claim 10, wherein said amino 
acid sequence is SEQ ID NO: 14. 

18. A transgenic plant containing a nucleic acid
construct comprising a polynucleotide selected from the
group consisting of:

a) SEQ ID NO: 1;
b) SEQ ID NO: 3;
c) SEQ ID NO: 5;
d) SEQ ID NO: 7;
e) SEQ ID NO: 9;
f) SEQ ID NO: 11;
g) SEQ ID NO: 13;
h) an RNA analog of SEQ ID NO: 1;
i) an RNA analog of SEQ ID NO: 3;
j) an RNA analog of SEQ ID NO: 5;
k) an RNA analog of SEQ ID NO: 7;
l) an RNA analog of SEQ ID NO: 9;
m) an RNA analog of SEQ ID NO: 11;
n) an RNA analog of SEQ ID NO: 13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.

19. The plant of claim 18, wherein said construct 
further comprises a regulatory element operably linked to 
said polynucleotide. 

20. The plant of claim 19, wherein said regulatory 
element is a tissue-specific promoter. 

21. The plant of claim 20, wherein said regulatory 
element is an epidermal cell-specific promoter.

22. The plant of claim 20, wherein said regulatory 
element is a seed-specific promoter that is operably 
linked in sense orientation to said polynucleotide. 

23. The plant of claim 22, wherein said plant has 
altered levels of very long chain fatty acids in seeds 
compared to the levels in a plant lacking said nucleic 
acid construct. 
24. A transgenic plant containing a nucleic acid
construct comprising a polynucleotide encoding a
polypeptide selected from the group consisting of: an
amino acid sequence substantially identical to SEQ ID
NO: 2, an amino acid sequence substantially identical to
SEQ ID NO: 4, an amino acid sequence substantially
identical to SEQ ID NO: 6, an amino acid sequence
substantially identical to SEQ ID NO: 8, an amino acid
sequence substantially identical to SEQ ID NO: 10, an
amino acid sequence substantially identical to SEQ ID
NO: 12, and an amino acid sequence substantially identical
to SEQ ID NO: 14.

25. The plant of claim 24, wherein said construct
further comprises a regulatory element operably linked to
said polynucleotide.

26. The plant of claim 25, wherein said regulatory
element is a tissue-specific promoter.

27. The plant of claim 26, wherein said regulatory
element is an epidermal cell-specific promoter.

28. The plant of claim 26, wherein said regulatory
element is a seed-specific promoter that is operably
linked in sense orientation to said polynucleotide.

29. The plant of claim 28, wherein said plant has
altered levels of very long chain fatty acids in seeds
compared to the levels in a plant lacking said nucleic
acid construct.

30. A method of altering the levels of very long chain 
fatty acids in a plant, comprising the steps of: 
A) creating a nucleic acid construct, said construct
comprising a polynucleotide selected from the group
consisting of:

a) SEQ ID NO: 1;
b) SEQ ID NO: 3;
c) SEQ ID NO: 5;
d) SEQ ID NO: 7;
e) SEQ ID NO: 9;
f) SEQ ID NO: 11;
g) SEQ ID NO: 13;
h) an RNA analog of SEQ ID NO: 1;
i) an RNA analog of SEQ ID NO: 3;
j) an RNA analog of SEQ ID NO: 5;
k) an RNA analog of SEQ ID NO: 7;
l) an RNA analog of SEQ ID NO: 9;
m) an RNA analog of SEQ ID NO: 11;
n) an RNA analog of SEQ ID NO: 13;

o) a polynucleotide having a nucleic acid sequence
complementary to a), b), c), d), e), f), g), h), i), j),
k), l), m), or n); and

p) a nucleic acid fragment of a), b), c), d), e),
f), g), h), i), j), k), l), m), n), or o) that is at
least 15 nucleotides in length and that hybridizes under
stringent conditions to genomic DNA encoding the
polypeptide of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14; and

B) introducing said construct into said plant, wherein
said polynucleotide is effective for altering the levels
of very long chain fatty acids in said plant.
FIGURE 2
EL1 1560 bases

ATGGATCGAG AGAGATTAAC GGCGGAGATG GCGTTTCGAG ATTCATCATC GGCCGTTATA
AGAATTCGAA GACGTTTGCC GGATTTATTA ACGTCCGTTA AGCTCAAATA CGTGAAGCTT
GGACTTCACA ACTCTTGCAA CGTGACCACC ATTCTCTTCT TCTTAATTAT TCTTCCTTTA
ACCGGAACCG TGCTGGTTCA GCTAACCGGT CTAACGTTCG ATACGTTCTC TGAGCTTTGG
TCTAACCAGG CGGTTCAACT CGACACGGCG ACGAGACTTA GCTGCTTGGT TTTCCTCTCC
TTCGTTTTGA CCCTCTACGT GGCTAACCGG TCTAAACCGG TTTACCTAGT GGATTTCTCC
TGCTACAAAC CGGAAGACGA GCGTAAAATA TCAGTAGATT CGTTCTTGAC GATGACTGAG
GAAAATGGAT CATTCACCGA TGACACGGTT CAGTTCCAGC AAAGAATCTC GAACCGGGCC
GGTTTGGGAG ACGAGACGTA TCTGCCACGT GGCATAACTT CAACGCCCCC GAAGCTAAAT
ATGTCAGAGG CACGTGCCGA AGCTGAAGCC GTTATGTTTG GAGCCTTAGA TTCCCTCTTC
GAGAAAACCG GAATTAAACC GGCCGAAGTC GGAATCTTGA TAGTAAACTG CAGCTTATTC
AATCCGACGC CGTCTCTATC AGCGATGATC GTGAACCATT ACAAGATGAG AGAAGACATC
AAAAGTTACA ACCTCGGAGG AATGGGTTGC TCCGCCGGAT TAATCTCAAT CGATCTCGCT
AACAATCTCC TCAAAGCAAA CCCTAATTCT TACGCTGTCG TGGTAAGCAC GGAAAACATA
ACCCTAAACT GGTACTTCGG AAATGACCGG TCAATGCTCC TCTGCAACTG CATCTTCCGA
ATGGGCGGAG CTGCGATTCT CCTCTCTAAC CGCCGTCAAG ACCGGAAGAA GTCAAAGTAC
TCGCTGGTCA ACGTCGTTCG AACACATAAA GGATCAGACG ACAAGAACTA CAATTGCGTG
TACCAGAAGG AAGACGAGAG AGGAACAATC GGTGTCTCTT TAGCTAGAGA GCTCATGTCT
GTCGCCGGAG ACGCTCTGAA AACAAACATC ACGACTTTAG GACCGATGGT TCTTCCATTG
TCAGAGCAGT TGATGTTCTT GATTTCCTTG GTCAAAAGGA AGATGTTCAA GTTAAAAGTT
AAACCGTATA TTCCGGATTT CAAGCTAGCT TTCGAGCATT TCTGTATTCA CGCAGGAGGT
AGAGCGGTTC TAGACGAAGT GCAGAAGAAT CTTGATCTCA AAGATTGGCA CATGGAACCT
TCTAGAATGA CTTTGCACAG ATTTGGTAAC ACTTCGAGTA GCTCGCTTTG GTATGAGATG
GCTTATACCG AAGCTAAGGG TCGGGTTAAA GCTGGTGACC GACTTTGGCA GATTGCGTTT
GGATCGGGTT TCAAGTGTAA TAGTGCGGTT TGGAAAGCGT TACGACCGGT TTCGACGGAG
GAGATGACCG GTAATGCTTG GGCTGGTTCG ATTGATCAAT ATCCGGTTAA AGTTGTGCAA

EL1
FIGURE 3
EL1 sequence

Molecular Weight 58379.00 Daltons
520 Amino Acids
62 Strongly Basic (+) Amino Acids (K,R)
52 Strongly Acidic (-) Amino Acids (D,E)
187 Hydrophobic Amino Acids (A,I,L,F,W,V)
144 Polar Amino Acids (N,C,Q,S,T,Y)
8.784 Isoelectric Point
10.804 Charge at pH 7.0

MDRERLTAEM AFRDSSSAVI RIRRRLPDLL TSVKLKYVKL GLHNSCNVTT ILFFLIILPL
TGTVLVQLTG LTFDTFSELW SNQAVQLDTA TRLTCLVFLS FVLTLYVANR SKPVYLVDFS
CYKPEDERKI SVDSFLTMTE ENGSFTDDTV QFQQRISNRA GLGDETYLPR GITSTPPKLN
MSEARAEAEA VMFGALDSLF EKTGIKPAEV GILIVNCSLF NPTPSLSAMI VNHYKMREDI
KSYNLGGMGC SAGLISIDLA NNLLKANPNS YAVVVSTENI TLNWYFGNDR SMLLCNCIFR
MGGAAILLSN RRQDRKKSKY SLVNVVRTHK GSDDKNYNCV YQKEDERGTI GVSLARELMS
VAGDALKTNI TTLGPMVLPL SEQLMFLISL VKRKMFKLKV KPYIPDFKLA FEHFCIHAGG
RAVLDEVQKN LDLKDWHMEP SRMTLHRFGN TSSSSLWYEM AYTEAKGRVK AGDRLWQIAF
GSGFKCNSAV WKALRPVSTE EMTGNAWAGS IDQYPVKVVQ

FIGURE 4
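
The residue-class tallies reported with the EL1 sequence (strongly basic, strongly acidic, hydrophobic, and polar amino acids) can in principle be recomputed from the one-letter sequence. The following is a minimal sketch in Python, assuming the class memberships printed in the figure (K,R; D,E; A,I,L,F,W,V; N,C,Q,S,T,Y). The function name `composition` and the dictionary keys are illustrative and are not taken from the software that produced the figure.

```python
from collections import Counter

# Residue classes as printed alongside the EL1 and EL2 protein sequences.
CLASSES = {
    "strongly_basic": "KR",
    "strongly_acidic": "DE",
    "hydrophobic": "AILFWV",
    "polar": "NCQSTY",
}


def composition(protein: str) -> dict[str, int]:
    """Count residues per class; spaces and line breaks are ignored."""
    residues = [ch for ch in protein.upper() if ch.isalpha()]
    counts = Counter(residues)
    summary = {
        name: sum(counts[aa] for aa in members)
        for name, members in CLASSES.items()
    }
    summary["length"] = len(residues)
    return summary


# First two ten-residue groups of the EL1 sequence.
print(composition("MDRERLTAEM AFRDSSSAVI"))
```

Residues outside the four classes (for example M, G, P, and H) are counted only in the total length, which is why the class tallies need not sum to the number of amino acids.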
EL2 1479 bases

ATGGATTACC CCATGAAGAA GGTAAAAATC TTTTTCAACT ACCTCATGGC GCATCGCTTC
AAGCTCTGCT TCTTACCATT AATGGTTGCT ATAGCCGTGG AGGCGTCTCG TCTTTCCACA 120
CAAGATCTCC AAAACTTTTA CCTCTACTTA CAAAACAACC ACACATCTCT AACCATGTTC
TTCCTTTACC TCGCTCTCGG GTCGACTCTT TACCTCATGA CCCGGCCCAA ACCCGTTTAT 240
CTCGTTGACT TTAGCTGCTA CCTCCCACCG TCGCATCTCA AAGCCAGCAC CCAGAGGATC
ATGCAACACG TAAGGCTTGT ACGAGAAGCA GGCGCGTGGA AGCAAGAGTC CGATTACTTG 360
ATGGACTTCT GCGAGAAGAT TCTAGAACGT TCCGGTCTAG GCCAAGAGAC GTACGTACCC
GAAGGTCTTC AAACTTTGCC ACTACAACAG AATTTGGCTG TATCACGTAT AGAGACGGAG 480
GAAGTTATTA TTGGTGCGGT CGATAATCTG TTTCGCAACA CGGGAATAAG CCCTAGTGAT
ATAGGTATAT TGGTGGTGAA TTCAAGCACT TTTAATCCAA CACCTTCGCT ATCAAGTATC 600
TTAGTGAATA AGTTTAAACT TAGGGATAAT ATAAAGAGCT TGAATCTTGG TGGGATGGGG
TGTAGCGCTG GAGTCATCGC TATCGATGCG GCTAAGAGCT TGTTACAAGT TCATAGAAAC 720
ACTTATGCTC TTGTGGTGAG CACGGAGAAC ATCACTCAAA ACTTGTACAT GGGTAACAAC
AAATCAATGT TGGTTACAAA CTGTTTGTTC CGTATAGGTG GGGCCGCGAT TTTGCTTTCT 840
AACCGGTCTA TAGATCGTAA ACGCGCAAAA TACGAGCTTG TTCACACCGT GCGGGTCCAT
ACCGGAGCAG ATGACCGATC CTATGAATGT GCAACTCAAG AAGAGGATGA AGATGGCATA 960
GTTGGGGTTT CCTTGTCAAA GAATCTACCA ATGGTAGCTG CAAGAACCCT AAAGATCAAT
ATCGCAACTT TGGGTCCGCT TGTTCTTCCC ATAAGCGAGA AGTTTCACTT CTTTGTGAGG 1080
TTCGTTAAAA AGAAGTTTCT CAACCCCAAG CTAAAGCATT ACATTCCGGA TTTCAAGCTC
GCATTCGAGC ATTTCTGTAT CCATGCGGGT GGTAGAGCGC TAATTGATGA GATGGAGAAG 1200
AATCTTCATC TAACTCCACT AGACGTTGAG GCTTCAAGAA TGACATTACA CAGGTTTGGT
AATACCTCTT CGAGCTCCAT TTGGTACGAG TTGGCTTACA CAGAAGCCAA AGGAAGGATG 1320
ACGAAAGGAG ATAGGATTTG GCAGATTGCG TTGGGGTCAG GTTTTAAGTG TAATAGTTCA
GTTTGGGTGG CTCTTCGTAA CGTCAAGCCT TCTACTAATA ATCCTTGGGA ACAGTGTCTA 1440
CACAAATATC CAGTTGAGAT CGATATAGAT TTAAAAGAG

EL2
FIGURE 5
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EL2 protein sequence 

Molecular Weight 55799.30 Daltons 

493 Amino Acids

55 Strongly Basic (+) Amino Acids (K,R)
46 Strongly Acidic (-) Amino Acids (D,E)
181 Hydrophobic Amino Acids (A,I,L,F,W,V)
134 Polar Amino Acids (N,C,Q,S,T,Y)
8.756 Isoelectric Point
10.995 Charge at pH 7.0

MDYPMKKVKI FFNYLMAHRF KLCFLPLMVA IAVEASRLST QDLQNFYLYL QNNHTSLTMF FLYLALGSTL
YLMTRPKPVY LVDFSCYLPP SHLKASTQRI MQHVRLVREA GAWKQESDYL MDFCEKILER SGLGQETYVP
EGLQTLPLQQ NLAVSRIETE EVIIGAVDNL FRNTGISPSD IGILVVNSST FNPTPSLSSI LVNKFKLRDN
IKSLNLGGMG CSAGVIAIDA AKSLLQVHRN TYALVVSTEN ITQNLYMGNN KSMLVTNCLF RIGGAAILLS
NRSIDRKRAK YELVHTVRVH TGADDRSYEC ATQEEDEDGI VGVSLSKNLP MVAARTLKIN IATLGPLVLP
ISEKFHFFVR FVKKKFLNPK LKHYIPDFKL AFEHFCIHAG GRALIDEMEK NLHLTPLDVE ASRMTLHRFG 
NTSSSSIWYE LAYTEAKGRM TKGDRIWQIA LGSGFKCNSS VWVALRNVKP STNNPWEQCL HKYPVEIDID 
LKE 

FIGURE 6
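The composition figures that accompany each protein sequence (strongly basic, strongly acidic, hydrophobic and polar residue counts) are simple tallies over the residue classes named in parentheses. The sketch below shows one way such tallies could be reproduced; the class definitions are taken from the parenthesized residue lists in the figures, while the function name `class_counts` and the dictionary keys are hypothetical.

```python
from collections import Counter

# Residue classes as listed in the figures.
RESIDUE_CLASSES = {
    "strongly_basic": "KR",
    "strongly_acidic": "DE",
    "hydrophobic": "AILFWV",
    "polar": "NCQSTY",
}


def class_counts(protein: str) -> dict:
    """Count residues per class in a one-letter protein string.

    Whitespace (the blocks of ten used in the figures) is ignored.
    Residues outside the four classes only contribute to 'length'.
    """
    seq = "".join(protein.split()).upper()
    counts = Counter(seq)
    result = {
        name: sum(counts[residue] for residue in residues)
        for name, residues in RESIDUE_CLASSES.items()
    }
    result["length"] = len(seq)
    return result


# First block of ten residues of the EL2 protein sequence.
print(class_counts("MDYPMKKVKI"))
```

Applied to a full sequence, the `length` entry should equal the stated number of amino acids, which gives a quick check on transcription of a listing.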
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EL3 1512 bases 

CTACGTCAGG GTAGAACAAA GAGTAAACAC TTAAGCAAAA CAATTTGTCC TACTCTTAGG TTATCTCCAA 
TGAAGAACTT AAAGATGGTT TTCTTCAAGA TCCTCTTTAT CTCTTTAATG GCAGGATTAG CCATGAAAGG 
ATCTAAGATC AACGTAGAAG ATCTCCAAAA GTTCTCCCTC CACCATACAC AGAACAACCT CCAAACCATA
AGCCTTCTAT TGTTTCTTGT CGTTTTTGTG TGGATCCTCT ACATGTTAAC CCGACCTAAA CCCGTTTACC 
TTGTTGATTT CTCCTGCTAC CTTCCACCGT CGCATCTCAA GGTCAGTATC CAAACCCTAA TGGGACACGC 
AAGACGTGCA AGAGAAGCAG GCATGTGTTG GAAGAACAAA GAGAGCGACC ATTTAGTTGA CTTCCAGGAG 
AAGATTCTTG AACGTTCCGG TCTTGGTCAA GAAACCTACA TCCCCGAGGG TCTTCAGTGC TTCCCACTTC 
AGCAAGGCAT GGGTGCTTCA CGTAAAGAGA CGGAAGAAGT AATCTTCGGA GCTCTTGACA ATCTTTTTCG 
CAACACCGGT GTAAAACCTG ATGATATCGG TATATTGGTG GTGAATTCTA GCACGTTTAA TCCAACTCCA 
TCACTCGCCT CCATGATTGT GAACAAGTAC AAACTCAGAG ACAACATCAA GAGTTTGAAT CTTGGAGGGA 
TGGGTTGCAG TGCCGGAGTT ATAGCTGTTG ATGTCGCTAA GGGATTACTA CAAGTTCATA GGAACACTTA 
TGCTATTGTA GTAAGCACAG AGAACATCAC TCAGAACTTA TACTTGGGGA AAAACAAATC AATGCTAGTC 
ACAAACTGTT TGTTCCGCGT TGGTGGTGCT GCGGTTCTGC TTTCAAACAG ATCTAGAGAC CGTAACCGCG
CCAAATACGA GCTTGTTCAC ACCGTACGGA TCCATACCGG ATCAGATGAT AGGTCGTTCG AATGTGCGAC 
ACAAGAAGAG GATGAAGATG GTATAATTGG AGTTACCTTG ACAAAGAATC TACCTATGGT GGCTGCAAGG 
ACTCTTAAGA TAAATATCGC AACTTTGGGT CCTCTTGTAC TTCCATTAAA AGAGAAGCTA GCCTTCTTTA 
TTACTTTTGT CAAGAAGAAG TATTTCAAGC CAGAGTTAAG GAATTATACA CCAGATTTCA AGCTTGCCTT 
TGAGCATTTC TGTATCCACG CTGGTGGAAG AGCTCTAATA GATGAGCTGG AGAAGAACCT TAAGCTTTCT 
CCGTTACACG TAGAGGCGTC AAGAATGACA CTACACAGGT TTGGTAACAC TTCTTCTAGC TCAATCTGGT
ACGAGTTAGC TTATACAGAA GCTAAAGGAA GGATGAAGGA AGGAGATAGG ATTTGGCAGA TTGCTTTGGG
GTCAGGTTTT AAGTGTAACA GTTCAGTATG GGTGGCTCTG CGAGACGTTA AGCCTTCAGC TAACAGTCCA 
TGGGAAGACT GTATGGATAG ATATCCGGTT GAGATTGATA TT 

EL3 
FIGURE 7 






EL3 protein sequence 

Molecular Weight 56801.10 Daltons 

504 Amino Acids 

66 Strongly Basic (+) Amino Acids (K,R)
48 Strongly Acidic (-) Amino Acids (D,E)
183 Hydrophobic Amino Acids (A,I,L,F,W,V)
127 Polar Amino Acids (N,C,Q,S,T,Y)
9.315 Isoelectric Point
19.797 Charge at pH 7.0



LRQGRTKSKH LSKTICPTLR LSPMKNLKMV FFKILFISLM AGLAMKGSKI NVEDLQKFSL HHTQNNLQTI
SLLLFLVVFV WILYMLTRPK PVYLVDFSCY LPPSHLKVSI QTLMGHARRA REAGMCWKNK ESDHLVDFQE
KILERSGLGQ ETYIPEGLQC FPLQQGMGAS RKETEEVIFG ALDNLFRNTG VKPDDIGILV VNSSTFNPTP
SLASMIVNKY KLRDNIKSLN LGGMGCSAGV IAVDVAKGLL QVHRNTYAIV VSTENITQNL YLGKNKSMLV
TNCLFRVGGA AVLLSNRSRD RNRAKYELVH TVRIHTGSDD RSFECATQEE DEDGIIGVTL TKNLPMVAAR
TLKINIATLG PLVLPLKEKL AFFITFVKKK YFKPELRNYT PDFKLAFEHF CIHAGGRALI DELEKNLKLS
PLHVEASRMT LHRFGNTSSS SIWYELAYTE AKGRMKEGDR IWQIALGSGF KCNSSVWVAL RDVKPSANSP
WEDCMDRYPV EIDI
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EL4 cDNA 1650 bases 

ATGGGTAGAT CCAACGAGCA AGATCTGCTC TCTACCGAGA TCGTTAATCG TGGGATCGAA CCATCCGGTC
CTAACGCCGG CTCACCAACG TTCTCGGTTA GGGTCAGGAG ACGTTTGCCT GATTTTCTTC AGTCGGTGAA 
CTTGAAGTAC GTGAAACTTG GTTACCACTA CCTCATAAAC CATGCGGTTT ATTTGGCGAC CATACCGGTT 
CTTGTGCTGG TTTTTAGTGC TGAGGTTGGG AGTTTAAGCA GAGAAGAGAT TTGGAAGAAG CTTTGGGACT 
ATGATCTTGC AACTGTTATC GGATTCTTCG GTGTCTTTGT TTTAACCGCT TGTGTCTACT TCATGTCTCG 
TCCTCGCTCT GTTTATCTTA TTGATTTCGC TTGTTACAAG CCCTCCGATG AACACAAGGT GACAAAAGAA 
GAGTTCATAG AACTAGCGAG AAAATCAGGG AAGTTCGACG AAGAGACACT CGGTTTCAAG AAGAGGATCT 
TACAAGCCTC AGGCATAGGC GACGAGACAT ACGTCCCAAG ATCCATCTCT TCATCAGAAA ACATAACAAC 
GATGAAAGAA GGTCGTGAAG AAGCCTCTAC AGTGATCTTT GGAGCACTAG ACGAACTCTT CGAGAAGACA 
CGTGTAAAAC CTAAAGACGT TGGTGTCCTT GTGGTTAACT GTAGCATTTT CAACCCGACA CCGTCGTTGT 
CCGCAATGGT GATAAACCAT TACAAGATGA GAGGGAACAT ACTTAGTTAC AACCTTGGAG GGATGGGATG
TTCGGCTGGA ATCATAGCTA TTGATCTTGC TCGTGACATG CTTCAGTCTA ACCCTAATAG TTATGCTGTT 
GTTGTGAGTA CTGAGATGGT TGGGTATAAT TGGTACGTGG GAAGTGACAA GTCAATGGTT ATACCTAATT 
GTTTCTTTAG GATGGGTTGT TCTGCCGTTA TGCTCTCTAA CCGTCGTCGT GACTTTCGCC ATGCTAAGTA 
CCGTCTCGAG CACATTGTCC GAACTCATAA GGCTGCTGAC GACCGTAGCT TCAGGAGTGT GTACCAGGAA 
GAAGATGAAC AAGGATTCAA GGGGTTGAAG ATAAGTAGAG ACTTAATGGA AGTTGGAGGT GAAGCTCTCA 
AGACAAACAT CACTACCTTA GGTCCTCTTG TCCTACCTTT CTCCGAGCAG CTTCTCTTCT TTGCTGCTTT 
GGTCCGCCGA ACATTCTCAC CTGCTGCCAA AACGTCCACA ACCACTTCCT TCTCTACTTC CGCCACCGCA 
AAAACCAATG GAATCAAGTC TTCCTCTTCC GATCTGTCCA AGCCATACAT CCCGGACTAC AAGCTCGCCT
TCGAGCATTT TTGCTTCCAC GCGGCAAGCA AAGTAGTGCT TGAAGAGCTT CAAAAGAATC TAGGCTTGAG 
TGAAGAGAAT ATGGAGGCTT CTAGGATGAC ACTTCACAGG TTTGGAAACA CTTCTAGCAG TGGAATCTGG 
TATGAGTTGG CTTACATGGA GGCCAAGGAA AGTGTTCGTA GAGGCGATAG GGTTTGGCAG ATCGCTTTCG
GTTCTGGTTT TAAGTGTAAC AGTGTGGTGT GGAAGGCAAT GAGGAAGGTG AAGAAGCCAA CCAGGAACAA 
TCCTTGGGTG GATTGCATCA ACCGTTACCC TGTGCCTCTC 



EL4 
FIGURE 9 
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EL4 protein sequence 

Molecular Weight 61953.80 Daltons 

550 Amino Acids 

71 Strongly Basic (+) Amino Acids (K,R)
58 Strongly Acidic (-) Amino Acids (D,E)
191 Hydrophobic Amino Acids (A,I,L,F,W,V)
147 Polar Amino Acids (N,C,Q,S,T,Y)
9.036 Isoelectric Point
14.349 Charge at pH 7.0

MGRSNEQDLL STEIVNRGIE PSGPNAGSPT FSVRVRRRLP DFLQSVNLKY VKLGYHYLIN HAVYLATIPV 
LVLVFSAEVG SLSREEIWKK LWDYDLATVI GFFGVFVLTA CVYFMSRPRS VYLIDFACYK PSDEHKVTKE 
EFIELARKSG KFDEETLGFK KRILQASGIG DETYVPRSIS SSENITTMKE GREEASTVIF GALDELFEKT
RVKPKDVGVL VVNCSIFNPT PSLSAMVINH YKMRGNILSY NLGGMGCSAG IIAIDLARDM LQSNPNSYAV
VVSTEMVGYN WYVGSDKSMV IPNCFFRMGC SAVMLSNRRR DFRHAKYRLE HIVRTHKAAD DRSFRSVYQE
EDEQGFKGLK ISRDLMEVGG EALKTNITTL GPLVLPFSEQ LLFFAALVRR TFSPAAKTST TTSFSTSATA
KTNGIKSSSS DLSKPYIPDY KLAFEHFCFH AASKVVLEEL QKNLGLSEEN MEASRMTLHR FGNTSSSGIW
YELAYMEAKE SVRRGDRVWQ IAFGSGFKCN SVVWKAMRKV KKPTRNNPWV DCINRYPVPL



EL4 
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EL 5 cDNA 1611 bases 

TCGAGCTACG TCAGGGCTTT TATATGCACA AATTCTCATA AAGTTTTCAA TTTTATTCCA TTTTTCTCGG 
AAGCCATGGA AGCTGCTAAT GAGCCTGTTA ATGGCGGATC CGTACAGATC CGAACAGAGA ACAACGAAAG
ACGAAAGCTT CCTAATTTCT TACAAAGCGT CAACATGAAA TACGTCAAGC TAGGTTATCA TTACCTCATT 
ACTCATCTCT TCAAGCTCTG TTTGGTTCCA TTAATGGCGG TTTTAGTCAC AGAGATCTCT CGATTAACAA 
CAGACGATCT TTACCAGATT TGGCTTCATC TCCAATACAA TCTCGTTGCT TTCATCTTTC TCTCTGCTTT 
AGCTATCTTT GGCTCCACCG TTTACATCAT GAGTCGTCCC AGATCTGTTT ATCTCGTTGA TTACTCTTGT 
TATCTTCCTC CGGAGAGTCT TCAGGTTAAG TATCAGAAGT TTATGGATCA TTCTAAGTTG ATTGAAGATT 
TCAATGAGTC ATCTTTAGAG TTTCAGAGGA AGATTCTTGA ACGTTCTGGT TTAGGAGAAG AGACTTATCT 
CCCTGAAGCT TTACATTGTA TCCCTCCGAG GCCTACGATG ATGGCGGCTC GTGAGGAATC TGAGCAGGTA 
ATGTTTGGTG CTCTTGATAA GCTTTTCGAG AATACCAAGA TTAACCCTAG GGATATTGGT GTGTTGGTTG 
TGAATTGTAG CTTGTTTAAT CCTACACCTT CGTTGTCAGC TATGATTGTT AACAAGTATA AGCTTAGAGG 
GAATGTTAAG AGTTTTAACC TTGGTGGAAT GGGGTGTAGT GCTGGTGTTA TCTCTATCGA TTTAGCTAAA 
GATATGTTGC AAGTTCATAG GAATACTTAT GCTGTTGTGG TTAGTACTGA GAACATTACT CAGAATTGGT
ATTTTGGGAA TAAGAAGGCT ATGTTGATTC CGAATTGTTT GTTTCGTGTT GGTGGTTCGG CGATTTTGTT 
GTCGAACAAG GGGAAAGATC GTAGACGGTC TAAGTATAAG CTTGTTCATA CCGTTAGGAC TCATAAAGGA 
GCTGTTGAGA AGGCTTTCAA CTGTGTTTAC CAAGAGCAAG ATGATAATGG GAAGACCGGG GTTTCGTTGT 
CGAAAGATCT TATGGCTATA GCTGGGGAAG CTCTTAAGGC GAATATCACT ACTTTAGGTC CTTTGGTTCT 
TCCTATAAGT GAGCAGATTC TGTTTTTCAT GACTTTGGTT ACGAAGAAAC TGTTTAACTC GAAGCTGAAG 
CCGTATATTC CGGATTTCAA GCTTGCGTTT GATCATTTCT GTATCCATGC TGGTGGTAGA GCTGTGATTG 
ATGAGCTTGA GAAGAATCTG CAGCTTTCGC AGACTCATGT CGAGGCATCC AGAATGACAC TGCACAGATT
TGGAAACACT TCTTCGAGCT CGATTTGGTA TGAACTGGCT TACATAGAGG CTAAAGGTAG GATGAAGAAA 
GGAAACCGGG TTTGGCAGAT TGCTTTTGGA AGTGGGTTTA AGTGTAACAG TGCAGTTTGG GTGGCTCTAA 
ACAATGTCAA GCCTTCGGTT AGTAGTCCGT GGGAACACTG CATCGACCGA TATCCGGTTA AGCTCGACTT 
C 



EL5
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ELS protein sequence 

Molecular Weight 60874.60 Daltons

537 Amino Acids

63 Strongly Basic (+) Amino Acids (K,R)
47 Strongly Acidic (-) Amino Acids (D,E)
198 Hydrophobic Amino Acids (A,I,L,F,W,V)
148 Polar Amino Acids (N,C,Q,S,T,Y)
9.107 Isoelectric Point
17.930 Charge at pH 7.0

SSYVRAFICT NSHKVFNFIP FFSEAMEAAN EPVNGGSVQI RTENNERRKL PNFLQSVNMK YVKLGYHYLI
THLFKLCLVP LMAVLVTEIS RLTTDDLYQI WLHLQYNLVA FIFLSALAIF GSTVYIMSRP RSVYLVDYSC 
YLPPESLQVK YQKFMDHSKL IEDFNESSLE FQRKILERSG LGEETYLPEA LHCIPPRPTM MAAREESEQV 
MFGALDKLFE NTKINPRDIG VLVVNCSLFN PTPSLSAMIV NKYKLRGNVK SFNLGGMGCS AGVISIDLAK
DMLQVHRNTY AVVVSTENIT QNWYFGNKKA MLIPNCLFRV GGSAILLSNK GKDRRRSKYK LVHTVRTHKG
AVEKAFNCVY QEQDDNGKTG VSLSKDLMAI AGEALKANIT TLGPLVLPIS EQILFFMTLV TKKLFNSKLK
PYIPDFKLAF DHFCIHAGGR AVIDELEKNL QLSQTHVEAS RMTLHRFGNT SSSSIWYELA YIEAKGRMKK 
GNRVWQIAFG SGFKCNSAVW VALNNVKPSV SSPWEHCIDR YPVKLDF 



EL5 
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EL6 1502 bases 

TCTCCGACGATGCCTCAGGCACCGATGCCAGAGTTCTCTAGCTCGGTGAAGCTCAAGTACGTGAAACTTGGTTACCAA 
TATTTGGTTAACCATTTCTTGAGTTTTCTTTTGATCCCGATCATGGCTATTGTCGCCGTTGAGCTTCTTCGGATGGGT 
CCTGAAGAGATCCTTAATGTTTGGAATTCACTCCAGTTTGACCTAGTTCAGGTTCTATGTTCTTCCTTCTTTGTCATC 
TTCATCTCCACTGTTTACTTCATGTCCAAGCCACGCACCATCTACCTCGTTGACTATTCTTGTTACAAGCCACCTGTC 
ACGTGTCGTGTCCCCTTCGCAACTTTCATGGAACACTCTCGTTTGATCCTCAAGGACAAGCCTAAGAGCGTCGAGTTC
CAAATGAGAATCCTTGAACGTTCTGGCCTCGGTGAGGAGACTTGTCTCCCTCCGGCTATTCATTATATTCCTCCCACA 
CCAACCATGGACGCGGCTAGAAGCGAGGCTCAGATGGTTATCTTCGAGGCCATGGACGATCTTTTCAAGAAAACCGGT 
CTTAAACCTAAAGACGTCGACATCCTTATCGTCAACTGCTCTCTTTTCTCTCCCACACCATCGCTCTCAGCTATGGTC 
ATCAACAAATATAAGCTTAGGAGTAATATCAAGAGCTTCAATCTTTCGGGGATGGGCTGCAGCGCGGGCCTGATCTCA 
GTTGATCTAGCCCGCGACTTGCTCCAAGTTCATCCCAATTCAAATGCAATCATCGTCAGCACGGAGATCATAACGCCT 
AATTACTATCAAGGCAACGAGAGAGCCATGTTGTTACCCAATTGTCTCTTCCGCATGGGTGCGGCAGCCATACACATG 
TCAAACCGCCGGTCTGACCGGTGGCGAGCCAAATACAAGCTTTCCCACCTCGTCCGGACACACCGTGGCGCTGACGAC 
AAGTCTTTCTACTGTGTCTACGAACAGGAAGACAAAGAAGGACACGTTGGCATCAACTTGTCCAAAGATCTCATGGCC 
ATCGCCGGTGAAGCCCTCAAGGCAAACATCACCACAATAGGTCCTTTGGTCCTACCGGCGTCAGAACAACTTCTCTTC 
CTCACGTCCCTAATCGGACGTAAAATCTTCAACCCGAAATGGAAACCATACATACCGGATTTCAAGCTGGCCTTCGAA 
CACTTTTGCATTCACGCAGGAGGCAGAGCGGTGATCGACGAGCTCCAAAAGAATCTACAACTATCAGGAGAACACGTT 
GAGGCCTCAAGAATGACACTACATCGTTTTGGTAACACGTCATCTTCATCGTTATGGTACGAGCTTAGCTACATCGAG 
TCTAAAGGGAGAATGAGGAGAGGCGATCGCGTTTGGCAAATCGCGTTTGGGAGTGGTTTCAAGTGTAACTCTGCCGTG 
TGGAAGTGTAACCGTACGATTAAGACACCTAAGGACGGACCATGGTCCGATTGTATCGACCGTTACCCTGTCTTTATT
CCCGAAGTTGTCAAACTCTA 

EL6 
FIGURE 13 
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EL6 protein sequence 

Molecular Weight 56687.90 Daltons 

500 Amino Acids 

59 Strongly Basic (+) Amino Acids (K,R)
46 Strongly Acidic (-) Amino Acids (D,E)
182 Hydrophobic Amino Acids (A,I,L,F,W,V)
127 Polar Amino Acids (N,C,Q,S,T,Y)
8.909 Isoelectric Point
14.567 Charge at pH 7.0



SPTMPQAPMP EFSSSVKLKY VKLGYQYLVN HFLSFLLIPI MAIVAVELLR MGPEEILNVW NSLQFDLVQV
LCSSFFVIFI STVYFMSKPR TIYLVDYSCY KPPVTCRVPF ATFMEHSRLI LKDKPKSVEF QMRILERSGL
GEETCLPPAI HYIPPTPTMD AARSEAQMVI FEAMDDLFKK TGLKPKDVDI LIVNCSLFSP TPSLSAMVIN
KYKLRSNIKS FNLSGMGCSA GLISVDLARD LLQVHPNSNA IIVSTEIITP NYYQGNERAM LLPNCLFRMG
AAAIHMSNRR SDRWRAKYKL SHLVRTHRGA DDKSFYCVYE QEDKEGHVGI NLSKDLMAIA GEALKANITT
IGPLVLPASE QLLFLTSLIG RKIFNPKWKP YIPDFKLAFE HFCIHAGGRA VIDELQKNLQ LSGEHVEASR
MTLHRFGNTS SSSLWYELSY IESKGRMRRG DRVWQIAFGS GFKCNSAVWK CNRTIKTPKD GPWSDCIDRY
PVFIPEVVKL
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EL7 1548 bases 

ATGGACGGTGCCGGAGAATCACGACTCGGTGGTGATGGTGGTGGTGATGGTTCTGTTGGAGTTCAGATCCGACAAACA 
CGGATGCTACCGGATTTTCTCCAGAGCGTGAATCTCAAGTATGTGAAATTAGGTTACCATTACTTAATCTCAAATCTC 
TTGACTCTCTGTTTATTCCCTCTCGCCGTTGTTATCTCCGTCGAAGCCTCTCAGATGAACCCAGATGATCTCAAACAG 
CTCTGGATCCATCTACAATACAATCTGGTTAGTATCATCATCTGTTCAGCGATTCTAGTCTTCGGGTTAACGGTTTAT 
GTTATGACCCGACCTAGACCCGTTTACTTGGTTGATTTCTCTTGTTATCTCCCACCTGATCATCTCAAAGCTCCTTAC
GCTCGGTTCATGGAACATTCTAGACTCACCGGAGATTTCGATGACTCTGCTCTCGAGTTTCAACGCAAGATCCTTGAG 
CGTTCTGGTTTAGGGGAAGACACTTATGTCCCTGAAGCTATGCATTATGTTCCACCGAGAATTTCAATGGCTGCTGCT 
AGAGAAGAAGCTGAACAAGTCATGTTTGGTGCTTTAGATAACCTTTTCGCTAACACTAATGTGAAACCAAAGGATATT 
GGAATCCTTGTTGTGAATTGTAGTCTCTTTAATCCAACTCCTTCGTTATCTGCAATGATTGTGAACAAGTATAAGCTT 
AGAGGTAACATTAGAAGCTACAATCTAGGCGGTATGGGTTGCAGCGCGGGAGTTATCGCTGTGGATCTTGCTAAAGAC 
ATGTTGTTGGTACATAGGAACACTTATGCGGTTGTTGTTTCTACTGAGAACATTACTCAGAATTGGTATTTTGGTAAC 
AAGAAATCGATGTTGATACCGAACTGCTTGTTTCGAGTTGGTGGCTCTGCGGTTTTGCTATCGAACAAGTCGAGGGAC 
AAGAGACGGTCTAAGTACAGGCTTGTACATGTAGTCAGGACTCACCGTGGAGCAGATGATAAAGCTTTCCGTTGTGTT 
TATCAAGAGCAGGATGATACAGGGAGAACCGGGGTTTCGTTGTCGAAAGATCTAATGGCGATTGCAGGGGAAACTCTC 
AAAACCAATATCACTACATTGGGTCCTCTTGTTCTACCGATAAGTGAGCAGATTCTCTTCTTTATGACTCTAGTTGTG 
AAGAAGCTCTTTAACGGTAAAGTGAAACCGTATATCCCGGATTTCAAACTTGCTTTCGAGCATTTCTGTATCCATGCT 
GGTGGAAGAGCTGTGATCGATGAGTTAGAGAAGAATCTGCAGCTTTCACCAGTTCATGTCGAGGCTTCGAGGATGACT 
CTTCATCGATTTGGTAACACATCTTCGAGCTCCATTTGGTATGAATTGGCTTACATTGAAGCGAAGGGAAGGATGCGA 
AGAGGTAATCGTGTTTGGCAAATCGCGTTCGGAAGTGGATTTAAATGTAATAGCGCGATTTGGGAAGCATTAAGGCAT 
GTGAAACCTTCGAACAACAGTCCTTGGGAAGATTGTATTGACAAGTATCCGGTAACTTTAAGTTAT 
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EL7 protein sequence 

Molecular Weight 57848.80 Daltons 

516 Amino Acids 

59 Strongly Basic (+) Amino Acids (K,R)
48 Strongly Acidic (-) Amino Acids (D,E)
189 Hydrophobic Amino Acids (A,I,L,F,W,V)
131 Polar Amino Acids (N,C,Q,S,T,Y)
8.872 Isoelectric Point
12.792 Charge at pH 7.0
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